wd_utils

The data.wd_utils module provides utility functions for accessing and storing Wikidata information.

Functions

wikirepo.data.wd_utils.check_in_ents_dict(ents_dict, qid)

Checks the provided entity dictionary for the given QID and adds the entity if it is not present.
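The check-then-add behavior described above is a simple caching pattern. The following is a minimal sketch of that pattern, not the library's implementation: `ensure_entity` and `fake_loader` are hypothetical names, and a plain dict stands in for the entity dictionary.

```python
from typing import Any, Callable, Dict


def ensure_entity(
    ents_dict: Dict[str, Any],
    qid: str,
    loader: Callable[[str], Any],
) -> Dict[str, Any]:
    """Add the entity for `qid` to `ents_dict` only if it is missing."""
    if qid not in ents_dict:
        # The loader stands in for whatever fetches the entity from Wikidata.
        ents_dict[qid] = loader(qid)
    return ents_dict


calls = []


def fake_loader(qid: str) -> dict:
    """Hypothetical loader that records each call instead of using the network."""
    calls.append(qid)
    return {"id": qid}


cache: Dict[str, Any] = {}
ensure_entity(cache, "Q183", fake_loader)
ensure_entity(cache, "Q183", fake_loader)  # the cached entry is reused here
```

Passing the loader in as an argument keeps the sketch free of network access, so the caching logic can be checked in isolation.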

wikirepo.data.wd_utils.load_ent(ents_dict, pq_id)

Loads an entity.

wikirepo.data.wd_utils.is_wd_id(var)

Checks whether a variable is a Wikidata id.
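Wikidata item ids take the form `Q` plus a number and property ids take the form `P` plus a number. A check of this kind could be sketched as below; the function name `looks_like_wd_id` and the exact pattern are assumptions for illustration and may differ from what `is_wd_id` actually accepts.

```python
import re

# Assumed pattern: "Q" or "P" followed by a positive integer without leading zeros.
_WD_ID_PATTERN = re.compile(r"[QP][1-9][0-9]*")


def looks_like_wd_id(var: object) -> bool:
    """Return True if `var` is a string shaped like a Wikidata QID or PID."""
    return isinstance(var, str) and _WD_ID_PATTERN.fullmatch(var) is not None
```

Under these assumptions, `looks_like_wd_id("Q183")` and `looks_like_wd_id("P585")` are true, while a label such as `"Germany"` or a non-string value is rejected.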

wikirepo.data.wd_utils.prop_has_many_entries(prop_ent)

Checks whether a Wikidata entry has multiple values for a given property.

wikirepo.data.wd_utils.get_lbl(ents_dict=None, pq_id=None)

Gets an English label of a Wikidata entity.

wikirepo.data.wd_utils.get_prop_id(ents_dict, qid, pid, i)

Gets the qid of an indexed property label of a Wikidata entity.

wikirepo.data.wd_utils.get_prop_lbl(ents_dict, qid, pid, i)

Gets a label of an indexed property label of a Wikidata entity.

wikirepo.data.wd_utils.get_prop_val(ents_dict, qid, pid, i, ignore_char='')

Gets a value of an indexed property label of a Wikidata entity.

wikirepo.data.wd_utils.prop_has_qualifiers(ents_dict, qid, pid, i)

Checks if the property has qualifiers.

wikirepo.data.wd_utils.get_qualifiers(ents_dict, qid, pid, i)

Gets the qualifiers of a property of a Wikidata entity.

wikirepo.data.wd_utils.get_prop_qualifier_val(ents_dict, qid, pid, sub_pid, i, ignore_char='')

Gets a value of an indexed qualifier property label of a Wikidata entity.
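To make the roles of `pid`, `sub_pid`, and the index `i` concrete, the sketch below reads a statement value and one of its qualifiers from a hand-written dict that follows the Wikidata JSON layout (`claims`, `mainsnak`, `qualifiers`). The helper names `statement_value` and `qualifier_value` are hypothetical, the sample values are made up for illustration, and the entity objects the library works with may expose a different interface.

```python
from typing import Any, Dict, Optional

# Trimmed, hand-written sample in the Wikidata JSON layout (illustrative values).
entity: Dict[str, Any] = {
    "id": "Q183",
    "claims": {
        "P1082": [  # one list entry per statement; `i` indexes into this list
            {
                "mainsnak": {
                    "datavalue": {
                        "type": "quantity",
                        "value": {"amount": "+1000", "unit": "1"},
                    }
                },
                "qualifiers": {
                    "P585": [  # point in time
                        {
                            "datavalue": {
                                "type": "time",
                                "value": {"time": "+2019-01-01T00:00:00Z"},
                            }
                        }
                    ]
                },
            }
        ]
    },
}


def statement_value(ent: Dict[str, Any], pid: str, i: int) -> Any:
    """Return the main value of the i-th statement for property `pid`."""
    return ent["claims"][pid][i]["mainsnak"]["datavalue"]["value"]


def qualifier_value(ent: Dict[str, Any], pid: str, sub_pid: str, i: int) -> Optional[Any]:
    """Return the first value of qualifier `sub_pid` on the i-th statement, if any."""
    qualifiers = ent["claims"][pid][i].get("qualifiers", {})
    if sub_pid not in qualifiers:
        return None
    return qualifiers[sub_pid][0]["datavalue"]["value"]
```

Here `statement_value(entity, "P1082", 0)` yields the quantity dict, and `qualifier_value(entity, "P1082", "P585", 0)` yields the time dict attached to that same statement.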

wikirepo.data.wd_utils.get_val(ents_dict, qid, pid, sub_pid, i, ignore_char='')

Combines get_prop_val, get_prop_qualifier_val, and boolean assignment.

wikirepo.data.wd_utils.get_prop_t(pid, i)

Gets a value of ‘P585’ (point in time) from a Wikidata property.

wikirepo.data.wd_utils.get_prop_start_t(pid, i)

Gets a value of ‘P580’ (start time) from a Wikidata property.

wikirepo.data.wd_utils.get_prop_end_t(pid, i)

Gets a value of ‘P582’ (end time) from a Wikidata property.

wikirepo.data.wd_utils.format_t(t)

Formats the date strings of a Wikidata entry.
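Wikidata stores time values as strings such as `+2019-12-31T00:00:00Z`. The output format of `format_t` is not stated here, so the following is only a sketch of one plausible conversion, to `datetime.date`; the name `parse_wd_time` is hypothetical, and dates before year 1 are not handled.

```python
from datetime import date


def parse_wd_time(t: str) -> date:
    """Convert a Wikidata time string such as '+2019-12-31T00:00:00Z' to a date."""
    date_part = t.lstrip("+").split("T")[0]
    year, month, day = (int(part) for part in date_part.split("-"))
    # Year- or month-precision values may use 00 for month or day;
    # this sketch assumes the first month or day in that case.
    return date(year, max(month, 1), max(day, 1))
```

Mapping `00` to `1` is a simplifying choice; a stricter version would also read the precision field that accompanies the time string in Wikidata's JSON.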

wikirepo.data.wd_utils.get_formatted_prop_t(ents_dict, qid, pid, i)

Gets the formatted ‘P585’ (point in time) from a Wikidata property.

wikirepo.data.wd_utils.get_formatted_prop_start_t(ents_dict, qid, pid, i)

Gets the formatted ‘P580’ (start time) from a Wikidata property.

wikirepo.data.wd_utils.get_formatted_prop_end_t(ents_dict, qid, pid, i)

Gets the formatted ‘P582’ (end time) from a Wikidata property.

wikirepo.data.wd_utils.get_prop_timespan_intersection(ents_dict, qid, pid, i, timespan, interval)

Combines get_formatted_prop_start_end_t and prop_start_end_to_timespan.
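The core idea of intersecting a property's start and end times with a queried timespan can be sketched as follows. This is an illustration under simplifying assumptions, not the library's code: `intersect_spans` is a hypothetical helper, both spans are inclusive `(start, end)` pairs of `datetime.date`, and the `interval` argument is left out.

```python
from datetime import date
from typing import Optional, Tuple

Span = Tuple[date, date]


def intersect_spans(prop_span: Span, query_span: Span) -> Optional[Span]:
    """Return the overlap of two inclusive date spans, or None if they are disjoint."""
    start = max(prop_span[0], query_span[0])
    end = min(prop_span[1], query_span[1])
    return (start, end) if start <= end else None
```

For example, a property valid from 2000-01-01 to 2010-12-31 and a query over 2005-01-01 to 2020-01-01 overlap from 2005-01-01 to 2010-12-31.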

wikirepo.data.wd_utils.dir_to_topic_page(dir_name=None, ents_dict=None, qid=None)

Checks for subject entities of a given QID.

Parameters:
dir_name : str (default=None)

The name of the directory within wikirepo.data.

ents_dict : wd_utils.EntitiesDict (default=None)

A dictionary with keys being Wikidata QIDs and values being their entities.

qid : str (default=None)

Wikidata QID for a location.

Returns:
topic_qid or None : str or None

The QID of an existing topic for the location, or None to cancel later steps.

wikirepo.data.wd_utils.t_to_prop_val_dict(dir_name=None, ents_dict=None, qids=None, pid=None, sub_pid=None, interval=None, timespan=None, ignore_char='', span=False)

Gets a dictionary of property value(s) indexed by time(s) from a locational entity.

Parameters:
dir_name : str (default=None)

The name of the directory within wikirepo.data.

ents_dict : wd_utils.EntitiesDict (default=None)

A dictionary with keys being Wikidata QIDs and values being their entities.

qids : str or list (contains strs) (default=None)

Wikidata QIDs for locations.

pid : str (default=None)

The Wikidata property that is being queried.

sub_pid : str (default=None)

The Wikidata property that subsets time values.

timespan : two-element tuple or list containing datetime.date or tuple (default=None: (date.today(), date.today()))

A tuple or list that defines the start and end dates to be queried.

Note 1: if True, then the full timespan from 1-1-1 to the current day will be queried.

Note 2: passing a single entry will query for that date only.

interval : str (default=None)

The time interval over which queries will be made.

Note 1: see data.time_utils for options.

Note 2: if None, then only the most recent data will be queried.

ignore_char : str (default='', no character to ignore)

Characters in the output that should be ignored.

span : bool (default=False)

Whether to check for P580 ‘start time’ and P582 ‘end time’ to create spans.

Returns:
t_prop_dict : dict

A dictionary of Wikidata properties indexed by their time.

Notes

Used to assign property values to a single column (values cannot have the same time value).
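The rules given for the timespan parameter (None, True, a single entry, or a start and end pair) can be sketched as a small normalization step. This is a sketch under stated assumptions rather than the library's code: `normalize_timespan` and its `today` argument are hypothetical, tuple elements are assumed to be `(year, month, day)`, and a bare `(year, month, day)` tuple passed directly as the timespan is not handled.

```python
from datetime import date
from typing import Optional, Tuple, Union

DateLike = Union[date, Tuple[int, int, int]]


def _to_date(value: DateLike) -> date:
    """Accept a datetime.date or an assumed (year, month, day) tuple."""
    return value if isinstance(value, date) else date(*value)


def normalize_timespan(timespan=None, today: Optional[date] = None) -> Tuple[date, date]:
    """Return an explicit (start, end) pair for the documented timespan inputs."""
    today = today or date.today()
    if timespan is None:
        # Default: query the current day only.
        return (today, today)
    if timespan is True:
        # Full timespan from 1-1-1 to the current day.
        return (date(1, 1, 1), today)
    if isinstance(timespan, date):
        # A single date queries that date only.
        return (timespan, timespan)
    if len(timespan) == 1:
        only = _to_date(timespan[0])
        return (only, only)
    start, end = timespan
    return (_to_date(start), _to_date(end))
```

The `today` argument exists only so the sketch gives repeatable results; leaving it out falls back to `date.today()`.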

wikirepo.data.wd_utils.t_to_prop_val_dict_dict(dir_name=None, ents_dict=None, qids=None, pid=None, sub_pid=None, interval=None, timespan=None, ignore_char='', span=False)

Gets a dictionary of dictionaries of multiple property values that are indexed by time(s) from a locational entity.

Parameters:
dir_name : str (default=None)

The name of the directory within wikirepo.data.

ents_dict : wd_utils.EntitiesDict (default=None)

A dictionary with keys being Wikidata QIDs and values being their entities.

qids : str or list (contains strs) (default=None)

Wikidata QIDs for locations.

pid : str (default=None)

The Wikidata property that is being queried.

sub_pid : str (default=None)

The Wikidata property that subsets time values.

timespan : two-element tuple or list containing datetime.date or tuple (default=None: (date.today(), date.today()))

A tuple or list that defines the start and end dates to be queried.

Note 1: if True, then the full timespan from 1-1-1 to the current day will be queried.

Note 2: passing a single entry will query for that date only.

interval : str (default=None)

The time interval over which queries will be made.

Note 1: see data.time_utils for options.

Note 2: if None, then only the most recent data will be queried.

ignore_char : str (default='', no character to ignore)

Characters in the output that should be ignored.

span : bool (default=False)

Whether to check for P580 ‘start time’ and P582 ‘end time’ to create spans.

Returns:
t_prop_dict : dict

A dictionary of Wikidata properties indexed by their time.

Notes

Used to assign property values to separate columns (values can have the same time value).
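The difference between one value per time and several values per time can be pictured with plain dictionaries. The exact keys and nesting returned by these functions are not specified above, so the shapes below are assumptions chosen only to illustrate the single-column versus separate-columns distinction; `column_names` and `to_rows` are hypothetical helpers and the numbers are made up.

```python
from typing import Any, Dict, List

# Assumed single-column shape: each time maps to exactly one value.
single: Dict[str, Any] = {"2019": 1000, "2020": 1010}

# Assumed multi-column shape: each time maps to a dict of named values.
nested: Dict[str, Dict[str, Any]] = {
    "2019": {"a": 1, "b": 2},
    "2020": {"a": 3},
}


def column_names(data: Dict[str, Dict[str, Any]]) -> List[str]:
    """Collect the inner keys in first-seen order; each becomes a column."""
    names: List[str] = []
    for values in data.values():
        for name in values:
            if name not in names:
                names.append(name)
    return names


def to_rows(data: Dict[str, Dict[str, Any]]) -> List[List[Any]]:
    """Build one row per time, filling missing columns with None."""
    cols = column_names(data)
    return [[t] + [values.get(c) for c in cols] for t, values in data.items()]
```

With these sample shapes, `single` fits one column directly, while `nested` needs one column per inner key, which is why values sharing a time are only possible in the nested form.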

class wikirepo.data.wd_utils.EntitiesDict(*args, **kwargs)

A dictionary for storing Wikidata entities.

Keys are QIDs, and values are the corresponding entities.