Converting to IAMC format
The IAMC format from the IAM consortium is popular in the energy modelling community, so Friendly data provides workflows to convert a data package to IAMC output with some configuration.
The IAMC format allows users to define their own hierarchy of variables. When using Friendly data, you can therefore associate specific files with different branches of the hierarchy. There are currently three ways of specifying this: (1) use a fixed string, (2) use a format string with one or more user-defined index columns, or (3) define a set of values that are combined and mapped to an IAMC variable.
A format string is specified in the index file by adding an iamc key. If the string contains a column name enclosed in braces, the corresponding values from that index column are substituted in that position when the IAMC file is created.
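The substitution described above can be pictured with plain Python string formatting. This is only a sketch of the idea, not Friendly data's implementation; the rows and the helper name iamc_variable are made up for illustration.

```python
# Hypothetical rows; "technology" stands in for an index column.
rows = [
    {"technology": "nuclear", "value": 62.8},
    {"technology": "wind_onshore", "value": 0.03},
]

# A format string as it might appear under the iamc key.
template = "Primary Energy|{technology}"


def iamc_variable(template: str, row: dict) -> str:
    """Fill each {column} placeholder with the row's value for that column."""
    return template.format_map(row)


variables = [iamc_variable(template, row) for row in rows]
print(variables)
```

Each row yields one variable name, so the number of distinct names depends on the values present in the index column.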
Let us consider the example data package:
$ tree
.
├── annual_cost_per_nameplate_capacity.csv
├── carrier.csv
├── conf.yaml
├── datapackage.json
├── emissions_per_flow_in.csv
├── flow_out_sum.csv
├── index.yaml
├── LICENSE
├── nameplate_capacity.csv
├── README.md
└── technology.csv
Consider the dataset flow_out_sum.csv, which looks like this:
scenario | techs | locs | carriers | unit | year | flow_out_sum |
---|---|---|---|---|---|---|
diag-npi | wind_offshore | DEU | electricity | twh | 2030 | 0.0026985550026472 |
diag-npi | wind_offshore | DNK | electricity | twh | 2030 | 0.0014073819977408 |
… | … | … | … | … | … | |
diag-npi | wind_onshore | CHE | electricity | twh | 2030 | 0.0007493784045182 |
diag-npi | wind_onshore | DEU | electricity | twh | 2030 | 0.0258391578821039 |
… | … | … | … | … | … | |
diag-npi | nuclear | CHE | electricity | twh | 2030 | 62.78803794129926 |
diag-npi | nuclear | DEU | electricity | twh | 2030 | 224.96177256922013 |
The IAMC format requires that the data have the columns model, scenario, region, variable, unit, and value. If the data is in “long format”, it should also have a year column. In the above dataset, locs is an alias for region, but there are no columns for model, variable, or value, and there is an additional column called techs.
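As a quick illustration of the column bookkeeping above, the following sketch lists which required IAMC columns are still missing once aliases are applied. The function name missing_iamc_columns is hypothetical and not part of Friendly data.

```python
# Columns the IAMC format requires, as listed in the text above.
REQUIRED = ["model", "scenario", "region", "variable", "unit", "value"]


def missing_iamc_columns(columns, alias=None):
    """Return the required IAMC columns absent after renaming via alias."""
    alias = alias or {}
    renamed = {alias.get(col, col) for col in columns}
    return [col for col in REQUIRED if col not in renamed]


# Header of flow_out_sum.csv from the example data package.
columns = ["scenario", "techs", "locs", "carriers", "unit", "year", "flow_out_sum"]
missing = missing_iamc_columns(columns, alias={"locs": "region"})
print(missing)  # ['model', 'variable', 'value']
```

The three names it reports are exactly the columns that the index entry and the configuration file have to supply.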
The corresponding entry in the index file looks something like this:
- agg:
    technology:
    - values:
      - wind_onshore
      - wind_offshore
      variable: Primary Energy|Wind
  alias:
    locs: region
    techs: technology
    carriers: carrier
  iamc: Primary Energy|{technology}
  idxcols:
  - scenario
  - carriers
  - techs
  - locs
  - unit
  - year
  path: flow_out_sum.csv
The alias key declares that techs is to be treated as technology, and locs as region, which supplies one of the missing columns required by the IAMC specification. You will also note that there is an iamc key, which mentions technology in {...}. This is a format string, which means that all occurrences of technology are to be replaced by the corresponding values in the data.
The agg key also specifies a rule that combines two technologies under a single name. The dataset has wind_onshore, wind_offshore, and nuclear. While the wind_* technologies are summed together, nuclear is substituted into the format string to form the IAMC variable. The resulting strings are then available in the variable column. However, the technology names are not particularly descriptive, so you probably want to replace them with names more commonly used in IAMC datasets. These alternate names can be specified in a separate CSV file, which is then referenced in the configuration file. The example data package contains a conf.yaml file, which has a section like this:
indices:
  technology: technology.csv
  carrier: carrier.csv
  model: calliope
The above configures technology names to be resolved as per technology.csv, which looks like this:
name | iamc |
---|---|
nuclear | Nuclear |
wind_offshore | Wind|Offshore |
wind_onshore | Wind|Onshore |
In the same configuration snippet, you can see there is a key for model, but instead of pointing to a file like technology does, it specifies a string. If a model column does not exist in your dataset, this string is taken as the default value for that column. This leaves only the value column, which is simply the data column; in our example that is flow_out_sum. And we have our data in IAMC format! The table below illustrates the resulting layout; note that its rows carry a Fixed Cost variable and a cost unit, so they are not the converted flow_out_sum.csv values.
model | scenario | region | variable | unit | year | value |
---|---|---|---|---|---|---|
calliope | diag-npi | CHE | Fixed Cost|Electricity|Wind | billion_2015eur_per_tw_per_year | 2030 | 47.7515 |
calliope | diag-npi | DEU | Fixed Cost|Electricity|Wind | billion_2015eur_per_tw_per_year | 2030 | 47.75149999999999 |
… | … | … | … | … | … | |
calliope | diag-npi | CHE | Fixed Cost|Electricity|Nuclear | billion_2015eur_per_tw_per_year | 2030 | 76.116 |
calliope | diag-npi | DEU | Fixed Cost|Electricity|Nuclear | billion_2015eur_per_tw_per_year | 2030 | 76.116 |
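The steps described so far (aliasing columns, summing aggregated technologies, substituting descriptive names into the format string, and filling in a default model) can be sketched in plain Python. This is a simplified illustration under stated assumptions, not Friendly data's actual code: the function to_iamc is hypothetical, and the numeric values are made up rather than taken from the example package.

```python
from collections import defaultdict

# Made-up rows shaped like flow_out_sum.csv (values are illustrative only).
rows = [
    {"scenario": "diag-npi", "techs": "wind_offshore", "locs": "DEU",
     "unit": "twh", "year": 2030, "flow_out_sum": 1.0},
    {"scenario": "diag-npi", "techs": "wind_onshore", "locs": "DEU",
     "unit": "twh", "year": 2030, "flow_out_sum": 2.0},
    {"scenario": "diag-npi", "techs": "nuclear", "locs": "DEU",
     "unit": "twh", "year": 2030, "flow_out_sum": 5.0},
]

alias = {"locs": "region", "techs": "technology"}  # from the index entry
agg = {  # technologies that are summed under one IAMC variable
    "wind_onshore": "Primary Energy|Wind",
    "wind_offshore": "Primary Energy|Wind",
}
names = {"nuclear": "Nuclear"}  # subset of technology.csv (name -> iamc)
template = "Primary Energy|{technology}"  # the iamc format string
default_model = "calliope"  # default from conf.yaml


def to_iamc(rows, value_col):
    """Sum the data column per (model, scenario, region, variable, unit, year)."""
    totals = defaultdict(float)
    for row in rows:
        rec = {alias.get(key, key): val for key, val in row.items()}
        tech = rec["technology"]
        if tech in agg:
            variable = agg[tech]  # aggregated technologies share a name
        else:
            variable = template.format(technology=names.get(tech, tech))
        key = (rec.get("model", default_model), rec["scenario"],
               rec["region"], variable, rec["unit"], rec["year"])
        totals[key] += rec[value_col]
    cols = ["model", "scenario", "region", "variable", "unit", "year"]
    return [dict(zip(cols, key), value=total) for key, total in totals.items()]


iamc_rows = to_iamc(rows, "flow_out_sum")
for record in iamc_rows:
    print(record)
```

In this sketch the two wind rows collapse into a single Primary Energy|Wind record, while nuclear keeps its own variable built from the format string.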
This kind of replacement from values in the dataset can be done with multiple columns. For example, the index entry for nameplate_capacity.csv looks like this:
- agg:
    technology:
    - values:
      - wind_onshore
      - wind_offshore
      variable: Capacity|Electricity|Wind
  alias:
    locs: region
    techs: technology
    carriers: carrier
  iamc: Capacity|{carrier}|{technology}
  idxcols:
  - scenario
  - carriers
  - techs
  - locs
  - unit
  - year
  path: nameplate_capacity.csv
Here, all possible combinations of technology and carrier will be tried, and only the ones present in the data will be included in the final output. If you do not need replacement from data, you can always use a regular string (without any {...}) to denote what should be in the variable column (see the example data package for other examples).
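The "try all combinations, keep those present" behaviour can be sketched as follows. This is an illustration only, not how Friendly data is implemented; the rows are hypothetical, and the heat and boiler values are invented to show a combination that is filtered out.

```python
from itertools import product
from string import Formatter

# Hypothetical index values after aliasing (heat/boiler are invented).
rows = [
    {"carrier": "electricity", "technology": "nuclear"},
    {"carrier": "electricity", "technology": "wind_onshore"},
    {"carrier": "heat", "technology": "boiler"},
]
template = "Capacity|{carrier}|{technology}"

# Column names referenced by the format string, in order of appearance.
fields = [name for _, name, _, _ in Formatter().parse(template) if name]

# Combinations that actually occur together in the data.
observed = {tuple(row[f] for f in fields) for row in rows}

# Every possible combination of the unique values in each column.
candidates = product(*(sorted({row[f] for row in rows}) for f in fields))

variables = [
    template.format(**dict(zip(fields, combo)))
    for combo in candidates
    if combo in observed  # drop combinations absent from the data
]
print(variables)
```

Of the six possible carrier and technology pairs, only the three that occur in the rows produce a variable name.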