# Wikidata Revision History Dataset

## Dataset Overview

The "Wikidata Revision History Dataset" is a sample drawn from the Wikidata edit history. For each edit it records the task performed, the user group responsible, the timestamp, and the item and class the edit pertains to. These details make it possible to analyze the behavior of user groups and their editing patterns.

### Dataset Contents

- `WD_Revision_History_Dataset.csv`
- `WD_items_classes.csv`
- `LICENSE`: Creative Commons License file.

### Data Source

The contents of this dataset were collected from Wikidata through the [Wikidata Query Service](https://www.mediawiki.org/wiki/Wikidata_Query_Service) and from seven tables of the MediaWiki database (user, user group, user former group, actor, revision, userindex and page) using [Toolforge](https://tools.wmflabs.org).

### Data Dictionary

1. WD_revision_history

| Column Name | Data Type | Description |
|------------ |---------- |------------ |
| rev_id | Integer | Revision identifier |
| item_page_id | Integer | The id of the item page of the revision |
| rev_text_id | Integer | The id of the revision comment |
| comment | String | The text describing the revision or edit performed |
| rev_type | String | The type of comment, structured or unstructured |
| edit_summary | String | The portion of the comment column extracted from between `/*` and `*/` |
| edit_type | String | The actual edit performed, based on the meaning of the edit_summary |
| user_name | String | Name of the user/editor performing this edit |
| user_group | String | The group this user belongs to |
| rev_timestamp | Date | The date and time of the revision |

2. WD_items_classes

| Column Name | Data Type | Description |
|------------ |---------- |------------ |
| item_id | String | Wikidata item identifier |
| item_class | String | The class of the item |
| item_page_id | Integer | The id of the item page |
| number_of_revs | Integer | Number of revisions of the item |
| maturity_level | String | Maturity level of the item based on the number of revisions: inception (1-10), creation (11-100), growth (101-1000), maturity (1001 or more) |

### License

This dataset is released under the [CC BY-NC-SA 4.0](LICENSE) license.

### Contact Information

For any questions or feedback, please contact the dataset maintainer:

- Name: Mariam Farda-Sarbas
- Email: mariam.fs@fu-berlin.de
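
### Usage Sketch

The data dictionary describes two derived columns by rule: `maturity_level` is a binning of `number_of_revs`, and `edit_summary` is the part of `comment` between `/*` and `*/`. The Python sketch below illustrates both rules. It is not the code used to build the dataset. The function names are hypothetical, the lowercase label strings are an assumption about how the values are spelled in the CSV, and taking the first `/* ... */` span with surrounding whitespace removed is likewise an assumption.

```python
import re
from typing import Optional


def maturity_level(number_of_revs: int) -> str:
    """Bin a revision count using the ranges from the data dictionary.

    Hypothetical helper; the label spelling is an assumption.
    """
    if number_of_revs < 1:
        # The data dictionary defines no level below one revision.
        raise ValueError("number_of_revs must be at least 1")
    if number_of_revs <= 10:
        return "inception"
    if number_of_revs <= 100:
        return "creation"
    if number_of_revs <= 1000:
        return "growth"
    return "maturity"


# Non-greedy match so only the first /* ... */ span is captured.
_SUMMARY_RE = re.compile(r"/\*(.*?)\*/", re.DOTALL)


def extract_edit_summary(comment: str) -> Optional[str]:
    """Return the text between /* and */, or None if the comment has no such span.

    Hypothetical helper; stripping whitespace is an assumption.
    """
    match = _SUMMARY_RE.search(comment)
    return match.group(1).strip() if match else None
```

For an illustrative comment such as `/* wbsetlabel-add:1|en */ Douglas Adams`, `extract_edit_summary` returns the text inside the markers, while a free-text comment without markers yields `None`.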