matminer.data_retrieval package
Submodules
matminer.data_retrieval.retrieve_Citrine module
matminer.data_retrieval.retrieve_MP module
class matminer.data_retrieval.retrieve_MP.MPDataRetrieval(api_key=None)
MPDataRetrieval retrieves data from the Materials Project database, prints the results, and converts them into an indexed or unindexed Pandas dataframe.
__init__(api_key=None)
Args:
- api_key: (str) Your Materials Project API key, or None if you have set the MAPI_KEY environment variable.
Returns: None
get_dataframe(criteria, properties, mp_decode=False, index_mpid=True)
Gets data from MP as a dataframe. See the API docs at https://materialsproject.org/wiki/index.php/The_Materials_API for more details.
Args:
- criteria: (str/dict) Criteria of the query, as a string or a mongo-style dict. The string form supports a simple but powerful syntax. E.g., “Fe2O3” means search for materials with reduced_formula Fe2O3. Wildcards are also supported. E.g., “*2O” means get all materials whose formula can be formed as *2O, e.g., Li2O, K2O, etc.
  Other syntax examples:
  - mp-1234: Interpreted as a Materials ID.
  - Fe2O3 or *2O3: Interpreted as reduced formulas.
  - Li-Fe-O or *-Fe-O: Interpreted as chemical systems.
  You can mix and match with spaces, which are interpreted as “OR”. E.g., “mp-1234 FeO” means query for all compounds with reduced formula FeO or with materials_id mp-1234.
  Using the full dict syntax, even more powerful queries can be constructed. For example, {“elements”:{“$in”:[“Li”, “Na”, “K”], “$all”: [“O”]}, “nelements”:2} selects all Li, Na and K oxides. {“band_gap”: {“$gt”: 1}} selects all materials with band gaps greater than 1 eV.
- properties: (list) Properties to request, as a list. For example, [“formula”, “formation_energy_per_atom”] returns the formula and formation energy per atom.
- mp_decode: (bool) Whether to decode results into Pymatgen objects where possible. In some cases it is useful to get the raw Python dict instead, i.e., set this to False.
- index_mpid: (bool) Whether to set the materials_id as the dataframe index.
Returns: A Pandas dataframe object
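The dict-style query described above can be sketched as follows. This is a minimal illustration, not an official recipe: the helper name `fetch_alkali_oxides` and the constant names are hypothetical, the criteria and property names are the ones quoted in the argument descriptions, and the call only works with matminer installed, network access, and a valid API key (passed in or set through MAPI_KEY).

```python
# Mongo-style criteria quoted in the docs above: binary oxides of Li, Na, or K.
ALKALI_OXIDES = {
    "elements": {"$in": ["Li", "Na", "K"], "$all": ["O"]},
    "nelements": 2,
}

# Properties to request, as named in the docs above.
PROPERTIES = ["formula", "formation_energy_per_atom"]


def fetch_alkali_oxides(api_key=None):
    """Hypothetical helper: query Materials Project and return a dataframe.

    With api_key=None, the key is expected in the MAPI_KEY environment variable.
    """
    # Imported lazily so this module loads even where matminer is not installed.
    from matminer.data_retrieval.retrieve_MP import MPDataRetrieval

    retriever = MPDataRetrieval(api_key=api_key)
    return retriever.get_dataframe(
        criteria=ALKALI_OXIDES,
        properties=PROPERTIES,
        mp_decode=False,   # keep raw Python values rather than Pymatgen objects
        index_mpid=True,   # use the materials_id as the dataframe index
    )
```

Calling `fetch_alkali_oxides()` should then yield one row per matching material, assuming the service accepts the key. The string form works the same way, for example `criteria="*2O"` in place of the dict.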
matminer.data_retrieval.retrieve_MPDS module
matminer.data_retrieval.retrieve_MongoDB module
class matminer.data_retrieval.retrieve_MongoDB.MongoDataRetrieval(coll)
__init__(coll)
Tool to retrieve data from a MongoDB collection and reformat it for data analysis.
Args:
- coll: A MongoDB collection object
get_dataframe(projection, query=None, sort=None, limit=None, idx_field=None)
Args:
- projection: (list) A list of str fields to grab; dot-notation is allowed. Set to None to try to auto-detect the fields.
- query: (JSON) A pymongo-style query to restrict the data being gathered.
- sort: (tuple) A pymongo-style sort option.
- limit: (int) Maximum number of entries to return.
- idx_field: (str) Name of the field to use as the index field (must be unique).
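A call using the arguments above might look like the sketch below. The field names (`formula`, `output.band_gap`), the constant names, and the helper `load_frame` are hypothetical and need to be adapted to the documents in your own collection. Only `projection`, `query`, and `limit` are passed; `sort` and `idx_field` are left at their defaults.

```python
# Hypothetical field names; substitute those present in your own collection.
PROJECTION = ["formula", "output.band_gap"]  # dot-notation reaches nested fields
QUERY = {"output.band_gap": {"$gt": 1}}      # pymongo-style filter
LIMIT = 100                                  # cap on the number of entries


def load_frame(coll):
    """Hypothetical helper: pull the projected fields of `coll` into a dataframe."""
    # Imported lazily so this module loads even where matminer is not installed.
    from matminer.data_retrieval.retrieve_MongoDB import MongoDataRetrieval

    retriever = MongoDataRetrieval(coll)
    return retriever.get_dataframe(
        projection=PROJECTION,
        query=QUERY,
        limit=LIMIT,
    )
```

Here `coll` would typically be a pymongo collection, for example `MongoClient()["my_db"]["my_collection"]`, where the database and collection names are placeholders.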
matminer.data_retrieval.retrieve_MongoDB.clean_projection(projection)
Projecting on both e.g. ‘a.b’ and ‘a’ is disallowed. Project inclusively.
Args:
- projection: (list) List of fields to grab; dot-notation is allowed.
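One way to read "project inclusively" is that when both a field and one of its parents are requested, only the parent is kept, since it already includes the nested field. The sketch below illustrates that interpretation. The function `clean_projection_sketch` is a hypothetical stand-in written for this page, not matminer's actual implementation, and the real function may differ in ordering and edge cases.

```python
def clean_projection_sketch(projection):
    """Drop any dotted field whose parent path is also requested.

    Hypothetical illustration only; not matminer's clean_projection.
    """
    requested = set(projection)
    kept = []
    for field in sorted(requested):
        parts = field.split(".")
        # Proper ancestors of "a.b.c" are "a" and "a.b".
        ancestors = {".".join(parts[:i]) for i in range(1, len(parts))}
        # Keep the field only if none of its ancestors was requested as well.
        if ancestors.isdisjoint(requested):
            kept.append(field)
    return kept


print(clean_projection_sketch(["a.b", "a", "c.d"]))  # → ['a', 'c.d']
```

Comparing whole path segments rather than string prefixes means a field such as `ab` is not mistaken for a child of `a`.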
matminer.data_retrieval.retrieve_MongoDB.is_int(x)
matminer.data_retrieval.retrieve_MongoDB.remove_ints(projection)