wd_utils
The data.wd_utils module provides utility functions for accessing and storing Wikidata information.
Functions
wikirepo.data.wd_utils.check_stget_propr_similarity()
wikirepo.data.wd_utils.get_formatted_prop_start_end_t()
wikirepo.data.wd_utils.prop_start_end_to_timespan()
wikirepo.data.wd_utils.get_prop_timespan()
wikirepo.data.wd_utils.check_for_pid_sub_page()
Classes
- wikirepo.data.wd_utils.check_in_ents_dict(ents_dict, qid)
Checks the provided entity dictionary for the given QID and adds the entity if it is not present.
- wikirepo.data.wd_utils.prop_has_many_entries(prop_ent)
Checks if a Wikidata entry has multiple values for a given property.
- wikirepo.data.wd_utils.get_lbl(ents_dict=None, pq_id=None)
Gets an English label of a Wikidata entity.
- wikirepo.data.wd_utils.get_prop_id(ents_dict, qid, pid, i)
Gets the QID of an indexed property label of a Wikidata entity.
- wikirepo.data.wd_utils.get_prop_lbl(ents_dict, qid, pid, i)
Gets a label of an indexed property label of a Wikidata entity.
- wikirepo.data.wd_utils.get_prop_val(ents_dict, qid, pid, i, ignore_char='')
Gets a value of an indexed property label of a Wikidata entity.
- wikirepo.data.wd_utils.prop_has_qualifiers(ents_dict, qid, pid, i)
Checks if the property has qualifiers.
- wikirepo.data.wd_utils.get_qualifiers(ents_dict, qid, pid, i)
Gets the qualifiers of a property of a Wikidata entity.
- wikirepo.data.wd_utils.get_prop_qualifier_val(ents_dict, qid, pid, sub_pid, i, ignore_char='')
Gets a value of an indexed qualifier property label of a Wikidata entity.
- wikirepo.data.wd_utils.get_val(ents_dict, qid, pid, sub_pid, i, ignore_char='')
Combines get_prop_val, get_prop_qualifier_val, and boolean assignment.
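To make the distinction between a property value and a qualifier value concrete, here is a minimal sketch that reads both from a hand-written entity in the shape of Wikidata's JSON export. It does not call wikirepo: the helper names statement_value and statement_qualifier are hypothetical, the sample figures are illustrative, and treating ignore_char as a set of characters to strip is an assumption.

```python
# Illustrative entity in the shape of Wikidata's JSON export (not real wikirepo output).
entity = {
    "claims": {
        "P1082": [  # population
            {
                "mainsnak": {"datavalue": {"value": {"amount": "+83149300"}}},
                "qualifiers": {
                    "P585": [{"datavalue": {"value": {"time": "+2019-09-30T00:00:00Z"}}}]
                },
            },
            {
                "mainsnak": {"datavalue": {"value": {"amount": "+80219695"}}},
                "qualifiers": {
                    "P585": [{"datavalue": {"value": {"time": "+2011-05-09T00:00:00Z"}}}]
                },
            },
        ]
    }
}


def statement_value(entity, pid, i, ignore_char=""):
    """Hypothetical helper: main value of the i-th statement for pid, as a string."""
    value = entity["claims"][pid][i]["mainsnak"]["datavalue"]["value"]
    raw = value["amount"] if isinstance(value, dict) and "amount" in value else str(value)
    # Assumption: every character listed in ignore_char is stripped from the output.
    for char in ignore_char:
        raw = raw.replace(char, "")
    return raw


def statement_qualifier(entity, pid, sub_pid, i):
    """Hypothetical helper: first qualifier value for sub_pid, or None if absent."""
    snaks = entity["claims"][pid][i].get("qualifiers", {}).get(sub_pid)
    if not snaks:
        return None
    return snaks[0]["datavalue"]["value"]


has_many_entries = len(entity["claims"]["P1082"]) > 1
population = statement_value(entity, "P1082", 0, ignore_char="+")
measured_at = statement_qualifier(entity, "P1082", "P585", 0)
```

The index i selects one statement among several for the same property, which is why a check for multiple entries usually comes first.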
- wikirepo.data.wd_utils.get_prop_t(pid, i)
Gets a value of ‘P585’ (point in time) from a Wikidata property.
- wikirepo.data.wd_utils.get_prop_start_t(pid, i)
Gets a value of ‘P580’ (start time) from a Wikidata property.
- wikirepo.data.wd_utils.get_prop_end_t(pid, i)
Gets a value of ‘P582’ (end time) from a Wikidata property.
- wikirepo.data.wd_utils.get_formatted_prop_t(ents_dict, qid, pid, i)
Gets the formatted ‘P585’ (point in time) from a Wikidata property.
- wikirepo.data.wd_utils.get_formatted_prop_start_t(ents_dict, qid, pid, i)
Gets the formatted ‘P580’ (start time) from a Wikidata property.
- wikirepo.data.wd_utils.get_formatted_prop_end_t(ents_dict, qid, pid, i)
Gets the formatted ‘P582’ (end time) from a Wikidata property.
- wikirepo.data.wd_utils.get_prop_timespan_intersection(ents_dict, qid, pid, i, timespan, interval)
Combines get_formatted_prop_start_end_t and prop_start_end_to_timespan.
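The idea behind formatting time qualifiers and intersecting them with a queried timespan can be sketched as follows. This is an independent illustration, not wikirepo's implementation: parse_wikidata_time and timespan_intersection are hypothetical names, only common-era dates are handled, and the interval argument is left out.

```python
from datetime import date


def parse_wikidata_time(time_str):
    """Convert a Wikidata time string such as '+2019-09-30T00:00:00Z' to a date.

    Sketch only: dates before year 1 (leading '-') are not handled.
    """
    year, month, day = time_str.lstrip("+").split("T")[0].split("-")
    # Wikidata stores 00 for an unknown month or day; fall back to 1 here.
    return date(int(year), max(int(month), 1), max(int(day), 1))


def timespan_intersection(start, end, timespan):
    """Return the overlap of [start, end] with timespan, or None if they are disjoint."""
    lower = max(start, min(timespan))
    upper = min(end, max(timespan))
    return (lower, upper) if lower <= upper else None


# A property valid from 2000 to 2010, queried over 2005 to 2020.
start = parse_wikidata_time("+2000-01-01T00:00:00Z")  # P580-style value
end = parse_wikidata_time("+2010-12-31T00:00:00Z")    # P582-style value
overlap = timespan_intersection(start, end, (date(2005, 1, 1), date(2020, 1, 1)))
```

Using min and max on the queried timespan means the sketch accepts the two dates in either order.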
- wikirepo.data.wd_utils.dir_to_topic_page(dir_name=None, ents_dict=None, qid=None)
Checks for an existing topic (subject) entity for a given QID.
- Parameters:
- dir_name : str (default=None)
The name of the directory within wikirepo.data.
- ents_dict : wd_utils.EntitiesDict (default=None)
A dictionary with keys being Wikidata QIDs and values being their entities.
- qid : str (default=None)
Wikidata QID for a location.
- Returns:
- topic_qid or None : str or None
The QID of an existing topic for the location, or None to cancel later steps.
- wikirepo.data.wd_utils.t_to_prop_val_dict(dir_name=None, ents_dict=None, qids=None, pid=None, sub_pid=None, interval=None, timespan=None, ignore_char='', span=False)
Gets a dictionary of property value(s) indexed by time(s) from a locational entity.
- Parameters:
- dir_name : str (default=None)
The name of the directory within wikirepo.data.
- ents_dict : wd_utils.EntitiesDict (default=None)
A dictionary with keys being Wikidata QIDs and values being their entities.
- qids : str or list (contains strs) (default=None)
Wikidata QIDs for locations.
- pid : str (default=None)
The Wikidata property that is being queried.
- sub_pid : str (default=None)
The Wikidata property that subsets time values.
- timespan : two-element tuple or list containing datetime.date or tuple (default=None: (date.today(), date.today()))
A tuple or list that defines the start and end dates to be queried.
Note 1: if True, then the full timespan from 1-1-1 to the current day will be queried.
Note 2: passing a single entry will query for that date only.
- interval : str (default=None)
The time interval over which queries will be made.
Note 1: see data.time_utils for options.
Note 2: if None, then only the most recent data will be queried.
- ignore_char : str (default='', no character to ignore)
Characters in the output that should be ignored.
- span : bool (default=False)
Whether to check for P580 ‘start time’ and P582 ‘end time’ to create spans.
- Returns:
- t_prop_dict : dict
A dictionary of Wikidata properties indexed by their time.
Notes
Used to assign property values to a single column (values cannot have the same time value).
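As a rough picture of why a single column needs unique time values, the sketch below builds a time-indexed dictionary from (time, value) pairs. The exact key and value formats that wikirepo returns are not stated here, so datetime.date keys and string values are assumptions, the sample figures are illustrative, and to_time_indexed is a hypothetical helper that rejects duplicate times as a design choice of this sketch.

```python
from datetime import date

# Illustrative (time, value) observations for one property of one location.
observations = [
    (date(2011, 5, 9), "80219695"),
    (date(2019, 9, 30), "83149300"),
]


def to_time_indexed(observations):
    """Hypothetical helper: map each time to exactly one value."""
    t_prop_dict = {}
    for t, value in observations:
        if t in t_prop_dict:
            # A single column cannot hold two values for the same time.
            raise ValueError(f"duplicate time value: {t}")
        t_prop_dict[t] = value
    return t_prop_dict


t_prop_dict = to_time_indexed(observations)
```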
- wikirepo.data.wd_utils.t_to_prop_val_dict_dict(dir_name=None, ents_dict=None, qids=None, pid=None, sub_pid=None, interval=None, timespan=None, ignore_char='', span=False)
Gets a dictionary of dictionaries of multiple property values that are indexed by time(s) from a locational entity.
- Parameters:
- dir_name : str (default=None)
The name of the directory within wikirepo.data.
- ents_dict : wd_utils.EntitiesDict (default=None)
A dictionary with keys being Wikidata QIDs and values being their entities.
- qids : str or list (contains strs) (default=None)
Wikidata QIDs for locations.
- pid : str (default=None)
The Wikidata property that is being queried.
- sub_pid : str (default=None)
The Wikidata property that subsets time values.
- timespan : two-element tuple or list containing datetime.date or tuple (default=None: (date.today(), date.today()))
A tuple or list that defines the start and end dates to be queried.
Note 1: if True, then the full timespan from 1-1-1 to the current day will be queried.
Note 2: passing a single entry will query for that date only.
- interval : str (default=None)
The time interval over which queries will be made.
Note 1: see data.time_utils for options.
Note 2: if None, then only the most recent data will be queried.
- ignore_char : str (default='', no character to ignore)
Characters in the output that should be ignored.
- span : bool (default=False)
Whether to check for P580 ‘start time’ and P582 ‘end time’ to create spans.
- Returns:
- t_prop_dict : dict
A dictionary of Wikidata properties indexed by their time.
Notes
Used to assign property values to separate columns (values can have the same time value).
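When several values share a time, nesting a second dictionary under each time keeps them apart so each can become its own column. The sketch below assumes an outer key of datetime.date and an inner key of a value label; that layout, the placeholder labels, and the helper name to_time_indexed_groups are assumptions for illustration rather than wikirepo's documented output.

```python
from collections import defaultdict
from datetime import date

# Illustrative (time, label, value) observations; labels are placeholders.
observations = [
    (date(2020, 1, 1), "label_a", "10"),
    (date(2020, 1, 1), "label_b", "20"),
    (date(2021, 1, 1), "label_a", "12"),
]


def to_time_indexed_groups(observations):
    """Hypothetical helper: group labelled values under their shared time."""
    grouped = defaultdict(dict)
    for t, label, value in observations:
        grouped[t][label] = value
    # Return a plain dict so missing times raise KeyError instead of being created.
    return dict(grouped)


t_prop_dict = to_time_indexed_groups(observations)
```

Each inner key can then be mapped to a separate column, with the outer key providing the shared time index.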