For convenience, documentation for selected sections of the frictionless API has been included below. For the upstream documentation, see Package and Resource.
Package
- class frictionless.package.Package(source=None, *, descriptor=None, resources=None, id=None, name=None, title=None, description=None, licenses=None, sources=None, profile=None, homepage=None, version=None, contributors=None, keywords=None, image=None, created=None, innerpath='datapackage.json', basepath='', detector=None, onerror='ignore', trusted=False, hashing=None)[source]
Package representation
API | Usage
-------- | --------
Public | from frictionless import Package
This class is one of the cornerstones of the Frictionless framework. It manages the underlying resources and provides the ability to describe a package.
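```python
from frictionless import Package, Resource

# Build a package from a single CSV resource and read its rows.
package = Package(resources=[Resource(path="data/table.csv")])
package.get_resource("table").read_rows() == [
    {"id": 1, "name": "english"},
    {"id": 2, "name": "中国人"},
]
```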
Parameters:
- source (any): Source of the package; can be in various forms.
Usually, it’s a package descriptor in the form of a dict or a path. It can also be a glob pattern or a resource path.
- descriptor (dict|str): A package descriptor provided explicitly.
Keyword arguments will patch this descriptor if provided.
- resources? (dict|Resource[]): A list of resource descriptors.
It can be dicts or Resource instances.
- id? (str): A property reserved for globally unique identifiers.
Examples of identifiers that are unique include UUIDs and DOIs.
- name? (str): A short url-usable (and preferably human-readable) name.
This MUST be lower-case and contain only alphanumeric characters along with “.”, “_” or “-” characters.
- title? (str): A Package title according to the specs.
It should be a human-oriented title of the package.
- description? (str): A Package description according to the specs.
It should be a human-oriented description of the package.
- licenses? (dict[]): The license(s) under which the package is provided.
If omitted it’s considered the same as the package’s licenses.
- sources? (dict[]): The raw sources for this data package.
It MUST be an array of Source objects. Each Source object MUST have a title and MAY have path and/or email properties.
- profile? (str): A string identifying the profile of this descriptor.
For example, fiscal-data-package.
- homepage? (str): A URL for the home on the web that is related to this package.
For example, github repository or ckan dataset address.
- version? (str): A version string identifying the version of the package.
It should conform to the Semantic Versioning requirements and should follow the Data Package Version pattern.
- contributors? (dict[]): The people or organizations who contributed to this package.
It MUST be an array. Each entry is a Contributor and MUST be an object. A Contributor MUST have a title property and MAY contain path, email, role and organization properties.
- keywords? (str[]): An Array of string keywords to assist users searching.
For example, [‘data’, ‘fiscal’]
- image? (str): An image to use for this data package.
For example, when showing the package in a listing.
- created? (str): The datetime on which this was created.
The datetime must conform to the string formats for RFC3339 datetime.
- innerpath? (str): A ZIP datapackage descriptor inner path.
Path to the package descriptor inside the ZIP datapackage. Example: some/folder/datapackage.yaml Default: datapackage.json
- basepath? (str): A basepath of the resource
The full path of the resource is the basepath joined with the resource path.
- detector? (Detector): File/table detector.
For more information, please check the Detector documentation.
- onerror? (ignore|warn|raise): Behaviour if there is an error.
It defaults to ‘ignore’. In the default mode, errors at the resource level are ignored; they remain available in the Header and Row objects and should be handled by the user.
- trusted? (bool): Don’t raise an exception on unsafe paths.
A path provided as part of the descriptor is considered unsafe if it contains path traversal or is absolute. A path provided as source or path is always trusted.
- hashing? (str): a hashing algorithm for resources
It defaults to ‘md5’.
- Raises:
FrictionlessException: raise any error that occurs during the process
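The naming rule given for the name parameter above (lower-case, alphanumeric characters plus “.”, “_” or “-”) can be sketched as a regular expression. This is a minimal illustration that assumes ASCII letters and digits only; NAME_PATTERN and is_valid_name are hypothetical helpers, not part of frictionless.

```python
import re

# Hypothetical pattern derived from the rule stated for `name`;
# it is not taken from the frictionless source code.
NAME_PATTERN = re.compile(r"[a-z0-9._-]+")


def is_valid_name(name: str) -> bool:
    """Return True if the name uses only a-z, 0-9, '.', '_' or '-'."""
    return bool(NAME_PATTERN.fullmatch(name))


print(is_valid_name("fiscal-data_2020.v1"))  # lower-case with allowed punctuation
print(is_valid_name("Fiscal Data"))          # upper-case letters and a space
```

The first call prints True and the second prints False, since upper-case letters and spaces fall outside the allowed set.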
- Attributes
basepath
description_html
hashing
metadata_errors
metadata_valid
onerror
resource_names
trusted
Methods
add_resource([source]): Add new resource to the package.
clear()
copy()
expand(): Expand metadata
from_bigquery(source, *[, dialect]): Import package from BigQuery
from_ckan(source, *[, dialect]): Import package from CKAN
from_sql(source, *[, dialect]): Import package from SQL
from_zip(path, **options): Create a package from ZIP
fromkeys(iterable[, value]): Create a new dictionary with keys from iterable and values set to value.
get(key[, default]): Return the value for key if key is in the dictionary, else default.
get_resource(name): Get resource by name.
has_resource(name): Check if a resource is present
infer(*[, stats]): Infer package's attributes
items()
keys()
metadata_attach(name, value): Helper method for attaching a value to the metadata
metadata_extract(descriptor): Helper method called during the metadata extraction
metadata_process(): Helper method called on any metadata change
metadata_validate(): Helper method called on any metadata change
pop(k[, d]): If key is not found, default is returned if given, otherwise KeyError is raised
popitem(*args, **kwargs): Remove and return a (key, value) pair as a 2-tuple.
property([func, cache, reset, write]): Create a metadata property
remove_resource(name): Remove resource by name.
setdefault(*args, **kwargs): Insert key with a value of default if key is not in the dictionary.
setinitial(key, value): Set an initial item in a subclass' constructor
to_bigquery(target, *[, dialect]): Export package to BigQuery
to_ckan(target, *[, dialect]): Export package to CKAN
to_copy(): Create a copy of the package
to_dict(): Convert metadata to a plain dict
to_json([path, encoder_class]): Save metadata as a json
to_sql(target, *[, dialect]): Export package to SQL
to_yaml([path]): Save metadata as a yaml
to_zip(path, *[, encoder_class]): Save package to a zip
update([E, ]**F): If E is present and has a .keys() method, then does: for k in E: D[k] = E[k]. If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v. In either case, this is followed by: for k in F: D[k] = F[k].
values()
metadata_Error
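The description of update in the method summary is the standard Python dict docstring. The three cases it describes can be illustrated with a plain dict; this sketch does not show whether Package adds any behaviour of its own on top of them.

```python
metadata = {"name": "old", "title": "Old title"}

# E has a .keys() method (a mapping): for k in E: D[k] = E[k]
metadata.update({"name": "new"})

# E lacks .keys() (an iterable of pairs): for k, v in E: D[k] = v
metadata.update([("version", "1.0.0")])

# Keyword arguments F are applied last: for k in F: D[k] = F[k]
metadata.update({"title": "From mapping"}, title="From keyword")

print(metadata)
```

Because keyword arguments are applied after the positional argument, the final title is the one passed as a keyword.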
- add_resource(source=None, **options)[source]
Add new resource to the package.
- Parameters:
source (dict|str): a data source
**options (dict): options of the Resource class
- Returns:
Resource/None: added Resource instance or None if not added
- property basepath
- Returns:
str: package basepath
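As a rough illustration of how a basepath and a resource path combine into a full path, the sketch below uses the standard library only. join_fullpath is a hypothetical helper; the actual joining logic in frictionless may differ, for example for remote paths.

```python
import posixpath


def join_fullpath(basepath: str, path: str) -> str:
    """Join a basepath and a relative resource path with '/' separators.

    An empty basepath leaves the path unchanged. Note that posixpath.join
    discards the basepath when `path` is absolute.
    """
    return posixpath.join(basepath, path)


print(join_fullpath("data", "table.csv"))
print(join_fullpath("", "table.csv"))
```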
- contributors
- Returns:
dict[]: package contributors
- created
- Returns:
str: package created
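The created value is expected to be an RFC3339 datetime string. A minimal way to check a candidate value with the standard library is sketched below; parse_created is a hypothetical helper, and this simplified version assumes whole seconds (no fractional part) and an explicit UTC offset or “Z”.

```python
from datetime import datetime


def parse_created(value: str) -> datetime:
    """Parse a timestamp such as 2020-01-31T12:00:00Z.

    Raises ValueError if the value does not match the expected layout.
    """
    # %z accepts "Z" as well as offsets like "+02:00" (Python 3.7+).
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S%z")


created = parse_created("2020-01-31T12:00:00Z")
print(created.isoformat())
```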
- description
- Returns:
str: package description
- property description_html
- Returns:
str: package description as HTML
- description_text
- Returns:
str: package description as plain text
- static from_bigquery(source, *, dialect=None)[source]
Import package from Bigquery
- Parameters:
source (string): BigQuery Service object
dialect (dict): BigQuery dialect
- Returns:
Package: package
- static from_ckan(source, *, dialect=None)[source]
Import package from CKAN
- Parameters:
source (string): CKAN instance url e.g. “https://demo.ckan.org”
dialect (dict): CKAN dialect
- Returns:
Package: package
- static from_sql(source, *, dialect=None)[source]
Import package from SQL
- Parameters:
source (any): SQL connection string or engine
dialect (dict): SQL dialect
- Returns:
Package: package
- static from_zip(path, **options)[source]
Create a package from ZIP
- Parameters:
path (str): file path
**options (dict): resource options
- get_resource(name)[source]
Get resource by name.
- Parameters:
name (str): resource name
- Raises:
FrictionlessException: if resource is not found
- Returns:
Resource/None: Resource instance or None if not found
- has_resource(name)[source]
Check if a resource is present
- Parameters:
name (str): schema resource name
- Returns:
bool: whether there is the resource
- property hashing
- Returns:
str: package hashing
- homepage
- Returns:
str: package homepage
- id
- Returns:
str: package id
- image
- Returns:
str: package image
- infer(*, stats=False)[source]
Infer package’s attributes
- Parameters:
stats? (bool): stream files completely and infer stats
- keywords
- Returns:
str[]: package keywords
- licenses
- Returns:
dict[]: package licenses
- metadata_Error
alias of
frictionless.errors.general.PackageError
- metadata_validate()[source]
Helper method called on any metadata change
- Parameters:
profile (dict): a profile to validate against
- name
- Returns:
str: package name
- property onerror
- Returns:
ignore|warn|raise: on error behaviour
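The three onerror modes (ignore, warn, raise) can be pictured as a small dispatcher. This is only a sketch of the general idea: handle_error is a hypothetical function, and RuntimeError stands in for FrictionlessException so that the example runs without the library installed.

```python
import warnings


def handle_error(message: str, onerror: str = "ignore") -> None:
    """React to an error according to the selected mode."""
    if onerror == "ignore":
        # Do nothing; the caller inspects the errors later.
        return
    if onerror == "warn":
        warnings.warn(message, UserWarning)
        return
    if onerror == "raise":
        # Stand-in exception type, chosen for this sketch only.
        raise RuntimeError(message)
    raise ValueError(f"unknown onerror mode: {onerror!r}")


handle_error("row 3 has a type error")  # default mode: silently ignored
```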
- profile
- Returns:
str: package profile
- remove_resource(name)[source]
Remove resource by name.
- Parameters:
name (str): resource name
- Raises:
FrictionlessException: if resource is not found
- Returns:
Resource/None: removed Resource instances or None if not found
- property resource_names
- Returns:
str[]: package resource names
- resources
- Returns:
Resource[]: package resources
- sources
- Returns:
dict[]: package sources
- title
- Returns:
str: package title
- to_bigquery(target, *, dialect=None)[source]
Export package to Bigquery
- Parameters:
target (string): BigQuery Service object
dialect (dict): BigQuery dialect
- Returns:
BigqueryStorage: storage
- to_ckan(target, *, dialect=None)[source]
Export package to CKAN
- Parameters:
target (string): CKAN instance url e.g. “https://demo.ckan.org”
dialect (dict): CKAN dialect
- Returns:
CkanStorage: storage
- to_sql(target, *, dialect=None)[source]
Export package to SQL
- Parameters:
target (any): SQL connection string or engine
dialect (dict): SQL dialect
- Returns:
SqlStorage: storage
- to_zip(path, *, encoder_class=None)[source]
Save package to a zip
- Parameters:
path (str): target path
encoder_class (object): json encoder class
- Raises:
FrictionlessException: on any error
- property trusted
- Returns:
bool: package trusted
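The trusted flag relates to paths that are considered unsafe, described in the parameter list as absolute paths or paths with traversal. One way such a check could look is sketched below; is_safe_path is a hypothetical helper and does not reproduce the exact rules frictionless applies (for instance, remote URLs are not treated specially here).

```python
import ntpath
import posixpath


def is_safe_path(path: str) -> bool:
    """Return False for absolute paths and for paths with a '..' segment."""
    if posixpath.isabs(path) or ntpath.isabs(path):
        return False
    # Normalise Windows separators, then look for parent-directory segments.
    segments = path.replace("\\", "/").split("/")
    return ".." not in segments


print(is_safe_path("data/table.csv"))
print(is_safe_path("../secret.csv"))
```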
- version
- Returns:
str: package version
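The version string should conform to Semantic Versioning. A simplified check is sketched below; the SEMVER pattern and is_semver are hypothetical and cover only the MAJOR.MINOR.PATCH core with optional pre-release and build suffixes, not every rule of the full specification.

```python
import re

# Simplified pattern: numeric core without leading zeros, then optional
# "-prerelease" and "+build" parts made of alphanumerics, dots and hyphens.
SEMVER = re.compile(
    r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?"
)


def is_semver(version: str) -> bool:
    """Return True if the version looks like MAJOR.MINOR.PATCH[-pre][+build]."""
    return bool(SEMVER.fullmatch(version))


print(is_semver("1.0.0"))
print(is_semver("1.0"))
```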
Resource
- class frictionless.resource.Resource(source=None, *, descriptor=None, name=None, title=None, description=None, mediatype=None, licenses=None, sources=None, profile=None, path=None, data=None, scheme=None, format=None, hashing=None, encoding=None, innerpath=None, compression=None, control=None, dialect=None, layout=None, schema=None, stats=None, basepath='', detector=None, onerror='ignore', trusted=False, package=None)[source]
Resource representation.
API | Usage
-------- | --------
Public | from frictionless import Resource
This class is one of the cornerstones of the Frictionless framework. It loads a data source and allows you to stream its parsed contents. At the same time, it is a metadata class that describes the data.
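```python
from frictionless import Resource

# Open a CSV file and compare its header and rows with the expected values.
with Resource("data/table.csv") as resource:
    resource.header == ["id", "name"]
    resource.read_rows() == [
        {"id": 1, "name": "english"},
        {"id": 2, "name": "中国人"},
    ]
```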
Parameters:
- source (any): Source of the resource; can be in various forms.
Usually, it’s a string as <scheme>://path/to/file.<format>. It also can be, for example, an array of data arrays/dictionaries. Or it can be a resource descriptor dict or path.
- descriptor (dict|str): A resource descriptor provided explicitly.
Keyword arguments will patch this descriptor if provided.
- name? (str): A Resource name according to the specs.
It should be a slugified name of the resource.
- title? (str): A Resource title according to the specs.
It should be a human-oriented title of the resource.
- description? (str): A Resource description according to the specs.
It should be a human-oriented description of the resource.
- mediatype? (str): A mediatype/mimetype of the resource, e.g. “text/csv” or “application/vnd.ms-excel”.
Mediatypes are maintained by the Internet Assigned Numbers Authority (IANA) in a media type registry.
- licenses? (dict[]): The license(s) under which the resource is provided.
If omitted it’s considered the same as the package’s licenses.
- sources? (dict[]): The raw sources for this data resource.
It MUST be an array of Source objects. Each Source object MUST have a title and MAY have path and/or email properties.
- profile? (str): A string identifying the profile of this descriptor.
For example, tabular-data-resource.
- scheme? (str): Scheme for loading the file (file, http, …).
If not set, it’ll be inferred from source.
- format? (str): File source’s format (csv, xls, …).
If not set, it’ll be inferred from source.
- hashing? (str): An algorithm to hash data.
It defaults to ‘md5’.
- encoding? (str): Source encoding.
If not set, it’ll be inferred from source.
- innerpath? (str): A path within the compressed file.
It defaults to the first file in the archive.
- compression? (str): Source file compression (zip, …).
If not set, it’ll be inferred from source.
- control? (dict|Control): File control.
For more information, please check the Control documentation.
- dialect? (dict|Dialect): Table dialect.
For more information, please check the Dialect documentation.
- layout? (dict|Layout): Table layout.
For more information, please check the Layout documentation.
- schema? (dict|Schema): Table schema.
For more information, please check the Schema documentation.
- stats? (dict): File/table stats.
A dict with the following possible properties: hash, bytes, fields, rows.
- basepath? (str): A basepath of the resource
The full path of the resource is the basepath joined with the resource path.
- detector? (Detector): File/table detector.
For more information, please check the Detector documentation.
- onerror? (ignore|warn|raise): Behaviour if there is an error.
It defaults to ‘ignore’. In the default mode, errors at the resource level are ignored; they remain available in the Header and Row objects and should be handled by the user.
- trusted? (bool): Don’t raise an exception on unsafe paths.
A path provided as part of the descriptor is considered unsafe if it contains path traversal or is absolute. A path provided as source or path is always trusted.
- package? (Package): The package that owns this resource.
It is relevant only if the resource is part of a data package.
- Raises:
FrictionlessException: raise any error that occurs during the process
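The source parameter is usually a string of the form <scheme>://path/to/file.<format>, and scheme and format are inferred from it when not set. The sketch below shows one plausible way to split such a string with the standard library. guess_scheme_and_format is a hypothetical helper, the fallback to "file" is an assumption of this sketch, and Windows drive letters (such as C:\data) are not handled.

```python
import posixpath
from urllib.parse import urlparse


def guess_scheme_and_format(source: str) -> tuple:
    """Return (scheme, format) guessed from a source string."""
    parsed = urlparse(source)
    # Assumption for this sketch: a source without a scheme is a local file.
    scheme = parsed.scheme or "file"
    # The format is taken from the file extension, lower-cased, without the dot.
    extension = posixpath.splitext(parsed.path)[1]
    return scheme, extension.lstrip(".").lower()


print(guess_scheme_and_format("data/table.csv"))
print(guess_scheme_and_format("https://example.com/data/table.XLS?download=1"))
```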
- Attributes
basepath
buffer: File’s bytes used as a sample
byte_stream: Byte stream in form of a generator
closed: Whether the table is closed
description_html
detector
fragment: Table’s lists used as fragment.
fullpath
header
labels
list_stream: List stream in form of a generator
metadata_errors
metadata_valid
onerror
package
row_stream: Row stream in form of a generator of Row objects
sample: Table’s lists used as sample.
text_stream: Text stream in form of a generator
trusted
Methods
clear()
close(): Close the table as "filelike.close" does
copy()
expand(): Expand metadata
from_petl(view, **options): Create a resource from PETL view
fromkeys(iterable[, value]): Create a new dictionary with keys from iterable and values set to value.
get(key[, default]): Return the value for key if key is in the dictionary, else default.
infer(*[, stats]): Infer metadata
items()
keys()
metadata_attach(name, value): Helper method for attaching a value to the metadata
metadata_extract(descriptor): Helper method called during the metadata extraction
metadata_process(): Helper method called on any metadata change
metadata_validate(): Helper method called on any metadata change
open(): Open the resource as "io.open" does
pop(k[, d]): If key is not found, default is returned if given, otherwise KeyError is raised
popitem(*args, **kwargs): Remove and return a (key, value) pair as a 2-tuple.
property([func, cache, reset, write]): Create a metadata property
read_bytes(*[, size]): Read bytes into memory
read_data(*[, size]): Read data into memory
read_lists(*[, size]): Read lists into memory
read_rows(*[, size]): Read rows into memory
read_text(*[, size]): Read text into memory
setdefault(*args, **kwargs): Insert key with a value of default if key is not in the dictionary.
setinitial(key, value): Set an initial item in a subclass' constructor
to_copy(**options): Create a copy from the resource
to_dict(): Create a dict from the resource
to_inline(*[, dialect]): Helper to export resource as an inline data
to_json([path, encoder_class]): Save metadata as a json
to_pandas(*[, dialect]): Helper to export resource as a Pandas dataframe
to_petl([normalize]): Export resource as a PETL table
to_snap(*[, json]): Create a snapshot from the resource
to_view([type]): Create a view from the resource
to_yaml([path]): Save metadata as a yaml
update([E, ]**F): If E is present and has a .keys() method, then does: for k in E: D[k] = E[k]. If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v. In either case, this is followed by: for k in F: D[k] = F[k].
values()
write([target]): Write this resource to the target resource
metadata_Error
- property basepath
- Returns
str: resource basepath
- property buffer
File’s bytes used as a sample
These buffer bytes are used to infer characteristics of the source file (e.g. encoding, …).
- Returns:
bytes?: file buffer
- property byte_stream
Byte stream in form of a generator
- Yields:
gen<bytes>?: byte stream
- property closed
Whether the table is closed
- Returns:
bool: if closed
- compression
- Returns
str: resource compression
- control
- Returns
Control: resource control
- data
- Returns
any[][]?: resource data
- description
- Returns
str: resource description
- property description_html
- Returns:
str?: resource description as HTML
- description_text
- Returns:
str: resource description as plain text
- property detector
- Returns
Detector: resource detector
- dialect
- Returns
Dialect: resource dialect
- encoding
- Returns
str: resource encoding
- format
- Returns
str: resource format
- property fragment
Table’s lists used as fragment.
These fragment rows are used internally to infer characteristics of the source file (e.g. schema, …).
- Returns:
list[]?: table fragment
- property fullpath
- Returns
str: resource fullpath
- hashing
- Returns
str: resource hashing
- property header
- Returns:
str[]?: table header
- infer(*, stats=False)[source]
Infer metadata
- Parameters:
stats? (bool): stream file completely and infer stats
- innerpath
- Returns
str: resource compression path
- property labels
- Returns:
str[]?: table labels
- layout
- Returns:
Layout: table layout
- licenses
- Returns
dict[]: resource licenses
- property list_stream
List stream in form of a generator
- Yields:
gen<any[][]>?: list stream
- mediatype
- Returns
str: resource mediatype
- metadata_Error
alias of
frictionless.errors.general.ResourceError
- metadata_validate()[source]
Helper method called on any metadata change
- Parameters:
profile (dict): a profile to validate against
- name
- Returns
str: resource name
- property onerror
- Returns:
ignore|warn|raise: on error behaviour
- open()[source]
Open the resource as “io.open” does
- Raises:
FrictionlessException: any exception that occurs
- property package
- Returns:
Package?: parent package
- path
- Returns
str: resource path
- profile
- Returns
str: resource profile
- property row_stream
Row stream in form of a generator of Row objects
- Yields:
gen<Row[]>?: row stream
- property sample
Table’s lists used as sample.
These sample rows are used to infer characteristics of the source file (e.g. schema, …).
- Returns:
list[]?: table sample
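To make the idea of inferring a schema from sample rows concrete, here is a deliberately small sketch. infer_type and infer_schema are hypothetical helpers that only distinguish integer, number and string columns; the Detector in frictionless is considerably more capable, and this is not its algorithm.

```python
def _all_cast(values, cast) -> bool:
    """Return True if every value can be converted with `cast`."""
    try:
        for value in values:
            cast(value)
    except ValueError:
        return False
    return True


def infer_type(values) -> str:
    """Guess a column type from its sample cell strings."""
    if _all_cast(values, int):
        return "integer"
    if _all_cast(values, float):
        return "number"
    return "string"


def infer_schema(labels, sample) -> dict:
    """Build a minimal schema dict from header labels and sample rows."""
    columns = zip(*sample)  # transpose rows into columns
    fields = [
        {"name": name, "type": infer_type(column)}
        for name, column in zip(labels, columns)
    ]
    return {"fields": fields}


sample = [["1", "english"], ["2", "中国人"]]
print(infer_schema(["id", "name"], sample))
```

For this sample the id column is guessed as integer and the name column as string.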
- schema
- Returns
Schema: resource schema
- scheme
- Returns
str: resource scheme
- sources
- Returns
dict[]: resource sources
- stats
- Returns
dict: resource stats
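The stats dict is described as having the possible properties hash, bytes, fields and rows. The sketch below computes comparable values for CSV content held in memory. compute_stats is a hypothetical helper; it assumes md5 hashing (the documented default), that the first row is a header, and that rows counts data rows only.

```python
import csv
import hashlib
import io


def compute_stats(data: bytes, encoding: str = "utf-8") -> dict:
    """Compute hash, bytes, fields and rows for CSV content."""
    text = data.decode(encoding)
    rows = list(csv.reader(io.StringIO(text, newline="")))
    header, body = rows[:1], rows[1:]
    return {
        "hash": hashlib.md5(data).hexdigest(),      # hex digest of the raw bytes
        "bytes": len(data),                         # size of the raw content
        "fields": len(header[0]) if header else 0,  # number of columns
        "rows": len(body),                          # data rows, header excluded
    }


data = "id,name\n1,english\n2,中国人\n".encode("utf-8")
stats = compute_stats(data)
print(stats["fields"], stats["rows"])
```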
- tabular
- Returns
bool: if resource is tabular
- property text_stream
Text stream in form of a generator
- Yields:
gen<str[]>?: text stream
- title
- Returns
str: resource title
- to_snap(*, json=False)[source]
Create a snapshot from the resource
- Parameters:
json (bool): make data types compatible with JSON format
- Returns
list: resource’s data
- to_view(type='look', **options)[source]
Create a view from the resource
See PETL’s docs for more information: https://petl.readthedocs.io/en/stable/util.html#visualising-tables
- Parameters:
type (look|lookall|see|display|displayall): view’s type
**options (dict): options to be passed to PETL
- Returns
str: resource’s view
- property trusted
- Returns:
bool: don’t raise an exception on unsafe paths