Converting to IAMC format
-------------------------

The IAMC format from the IAM consortium is popular in the energy
modelling community, so Friendly data provides workflows to convert a
data package to IAMC output with some configuration.

The IAMC format allows users to define their own hierarchy of
variables.  When using Friendly data, you can associate specific files
with different branches of the hierarchy.  There are currently three
ways of specifying this:

1. use a fixed string,
2. use a format string with one or more user-defined index columns, and
3. define a set of values that are combined and mapped to an IAMC
   variable.

A format string is specified in the index file by adding an ``iamc``
key.  If the string contains a column name enclosed in braces, the
corresponding values from that index column are substituted in that
position when the IAMC file is created.

Let us consider the `example data package`_::

  $ tree
  .
  ├── annual_cost_per_nameplate_capacity.csv
  ├── carrier.csv
  ├── conf.yaml
  ├── datapackage.json
  ├── emissions_per_flow_in.csv
  ├── flow_out_sum.csv
  ├── index.yaml
  ├── LICENSE
  ├── nameplate_capacity.csv
  ├── README.md
  └── technology.csv

Consider the dataset ``flow_out_sum.csv``, which looks like this:

.. csv-table:: Energy flow out
   :file: _static/data/flow_out_sum.csv
   :header-rows: 1

The IAMC format requires that the data have the columns: ``model``,
``scenario``, ``region``, ``variable``, ``unit``, and ``value``.  If
the data is in "long format", then it should also have a column
``year``.  In the above dataset, ``locs`` is an alias for ``region``,
but there are no columns for ``model``, ``variable``, or ``value``,
and there is an additional column called ``techs``.
The corresponding entry in the index file looks something like this::

  - agg:
      technology:
      - values:
        - wind_onshore
        - wind_offshore
        variable: Primary Energy|Wind
    alias:
      locs: region
      techs: technology
      carriers: carrier
    iamc: Primary Energy|{technology}
    idxcols:
    - scenario
    - carriers
    - techs
    - locs
    - unit
    - year
    path: flow_out_sum.csv

The ``alias`` key declares that ``techs`` is to be treated as
``technology``, and ``locs`` as ``region``, which satisfies one of the
missing columns required by the IAMC specification.  You will also
note that there is an ``iamc`` key, which mentions ``technology`` in
``{...}``.  This is a format string, which means all occurrences of
``{technology}`` are replaced by the corresponding values in the data.
The ``agg`` key also specifies a rule that combines two technologies
under a single name.  The dataset has ``wind_onshore``,
``wind_offshore``, and ``nuclear``.  While the ``wind_*`` technologies
are summed together under ``Primary Energy|Wind``, ``nuclear`` is
substituted into the format string to form the IAMC variable.  The
resulting strings are then available under the ``variable`` column.

However, you will note that the technology names are not particularly
descriptive, so you probably want to replace them with something more
commonly used in an IAMC dataset.  These alternate names can be
specified in a separate CSV file, and provided in the configuration
file.  If we refer to the `example data package`_, we will find a
``conf.yaml`` file, which has a section like this::

  indices:
    technology: technology.csv
    carrier: carrier.csv
    model: calliope

The above configures technology names to be resolved as per
``technology.csv``, which looks like this:

.. csv-table:: Technology definitions
   :file: _static/data/technology.csv
   :header-rows: 1

In the same configuration snippet, you can see there's a key for
``model``, but instead of pointing to a file like ``technology``, it
specifies a string.
If a ``model`` column does not exist in your dataset, this string will
be taken as the default value for such a column.  This leaves only the
``value`` column, which is simply the data column; in our example that
is ``flow_out_sum``.  And we have our data in IAMC format!

.. csv-table:: Data in IAMC format
   :file: _static/data/iamc.csv
   :header-rows: 1

This kind of replacement from values in the dataset can be done with
multiple columns, e.g. the index entry for ``nameplate_capacity.csv``
looks like this::

  - agg:
      technology:
      - values:
        - wind_onshore
        - wind_offshore
        variable: Capacity|Electricity|Wind
    alias:
      locs: region
      techs: technology
      carriers: carrier
    iamc: Capacity|{carrier}|{technology}
    idxcols:
    - scenario
    - carriers
    - techs
    - locs
    - unit
    - year
    path: nameplate_capacity.csv

Here, all possible combinations of ``technology`` and ``carrier`` will
be tried, and only the ones present in the data will be included in
the final output.

If you do not need replacement from data, you can always use a regular
string (without any ``{...}``) to denote what should be in the
``variable`` column (see the `example data package`_ for other
examples).

.. _`example data package`: https://github.com/sentinel-energy/friendly_data_example
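
To make the rules for ``flow_out_sum.csv`` concrete, the following is a
minimal sketch in plain Python of how a format string and an ``agg``
rule could turn rows into IAMC variables.  It is an illustration only,
not Friendly data's implementation: the helper names
``to_iamc_variable`` and ``convert``, the sample rows, the region name
``R1``, and all numbers are hypothetical.  It also skips the name
lookup via ``technology.csv``, so the raw technology name is
substituted as-is.

```python
from collections import defaultdict

# Hypothetical rows mimicking flow_out_sum.csv after the aliases have been
# applied (techs -> technology, locs -> region); the numbers are invented.
rows = [
    {"technology": "wind_onshore", "region": "R1", "year": 2030, "value": 10.0},
    {"technology": "wind_offshore", "region": "R1", "year": 2030, "value": 5.0},
    {"technology": "nuclear", "region": "R1", "year": 2030, "value": 20.0},
]

# These mirror the ``iamc`` and ``agg`` keys of the index entry.
iamc_fmt = "Primary Energy|{technology}"
agg = {
    "technology": [
        {
            "values": ["wind_onshore", "wind_offshore"],
            "variable": "Primary Energy|Wind",
        }
    ]
}


def to_iamc_variable(row, fmt, agg_rules):
    """Return the IAMC variable name for one row (illustrative helper)."""
    for column, rules in agg_rules.items():
        for rule in rules:
            if row[column] in rule["values"]:
                # The value belongs to an aggregation group: use its name.
                return rule["variable"]
    # Otherwise substitute column values into the format string.
    return fmt.format(**row)


def convert(rows, fmt, agg_rules, model):
    """Sum values per (model, region, variable, year)."""
    totals = defaultdict(float)
    for row in rows:
        variable = to_iamc_variable(row, fmt, agg_rules)
        totals[(model, row["region"], variable, row["year"])] += row["value"]
    return dict(totals)


# ``model`` is the default string from conf.yaml.
result = convert(rows, iamc_fmt, agg, model="calliope")
for key, value in sorted(result.items()):
    print(key, value)
```

In this sketch the two wind rows collapse into a single
``Primary Energy|Wind`` entry, while ``nuclear`` goes through the
format string.  A format string with several columns, such as
``Capacity|{carrier}|{technology}``, would work the same way, provided
each row carries all the referenced columns.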