[GH-ISSUE #597] No unique player identifier #108

Closed
opened 2026-03-02 15:55:51 +03:00 by kerem · 1 comment

Originally created by @ProjectPear100 on GitHub (May 28, 2024).
Original GitHub issue: https://github.com/probberechts/soccerdata/issues/597

Is there any way to join player data from multiple sources (FBref, WhoScored, SoFIFA) using an identifier or unique key of some sort? I have gone through the documentation and was unable to find any method for uniquely identifying a player. Joining on names won't work, since multiple players can share the same name and names are spelled differently across sources.

kerem closed this issue and added the question label on 2026-03-02 15:55:51 +03:00.

@probberechts commented on GitHub (May 28, 2024):

You'll have to programmatically and/or manually map the player IDs between the different data sources. There are no shortcuts, and soccerdata does not provide any support for doing this. In the future, I may add support for replacing player names / IDs with a standardized name / ID given a mapping (similar to what is already possible for teams using the `config/teamname_replacements.json` file). However, I consider creating the mapping itself out of scope for this package.
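
Until something like that exists, a hand-maintained mapping can be applied outside the package. The following is only a minimal sketch of that idea, not a soccerdata feature: the `PLAYER_ID_MAP` structure, the `add_canonical_id` helper, and every ID shown are hypothetical placeholders invented for illustration.

```python
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical, hand-maintained mapping (all IDs below are made up):
# (source name, source-specific player ID) -> your own canonical player ID.
PLAYER_ID_MAP: Dict[Tuple[str, str], str] = {
    ("fbref", "a1b2c3d4"): "player_0001",
    ("whoscored", "12345"): "player_0001",
    ("sofifa", "67890"): "player_0001",
}


def add_canonical_id(
    rows: List[Dict[str, Any]],
    source: str,
    id_field: str,
    mapping: Dict[Tuple[str, str], str],
) -> List[Dict[str, Any]]:
    """Return copies of `rows` with a `canonical_id` key added.

    Rows whose source-specific ID is not in the mapping get None, so
    unmapped players stay visible instead of being silently dropped.
    """
    result = []
    for row in rows:
        key = (source, str(row[id_field]))
        canonical: Optional[str] = mapping.get(key)
        result.append({**row, "canonical_id": canonical})
    return result


# Usage: tag rows from two sources, then join them on `canonical_id`.
fbref_rows = add_canonical_id(
    [{"player_id": "a1b2c3d4", "goals": 3}], "fbref", "player_id", PLAYER_ID_MAP
)
whoscored_rows = add_canonical_id(
    [{"player_id": 12345, "rating": 7.4}], "whoscored", "player_id", PLAYER_ID_MAP
)
```

Once every source carries the same `canonical_id` column, the join itself is an ordinary key-based merge.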

For some tips on how to match player IDs across multiple data sources, I can recommend this blog post: https://unravelsports.github.io/2022/07/11/player-id-matching-system.html
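
As a rough starting point for the programmatic part, names can be normalized and then fuzzy-matched using only the Python standard library. This is a sketch under stated assumptions, not the method from the blog post and not anything soccerdata provides: `normalize_name`, `match_players`, and the `0.85` cutoff are illustrative choices, and name matching alone cannot separate two different players who share a name, so results should be checked against extra fields (birth date, team, season) or reviewed by hand.

```python
import difflib
import unicodedata
from typing import Dict, List, Optional


def normalize_name(name: str) -> str:
    """Lowercase, strip combining accents, and collapse whitespace and hyphens.

    Note: letters without a Unicode decomposition (for example "ø") are
    left unchanged by this step.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.lower().replace("-", " ").split())


def match_players(
    left: List[str], right: List[str], cutoff: float = 0.85
) -> Dict[str, Optional[str]]:
    """Map each name in `left` to its closest name in `right`, or None.

    `cutoff` is an arbitrary similarity threshold (0 to 1) chosen for
    illustration; tune it on your own data.
    """
    # Normalized name -> original spelling in the right-hand source.
    normalized_right: Dict[str, str] = {}
    for name in right:
        normalized_right.setdefault(normalize_name(name), name)

    matches: Dict[str, Optional[str]] = {}
    for name in left:
        candidates = difflib.get_close_matches(
            normalize_name(name), list(normalized_right), n=1, cutoff=cutoff
        )
        matches[name] = normalized_right[candidates[0]] if candidates else None
    return matches
```

Names that come back as `None` are the ones to resolve manually; confirmed pairs can then be written into a mapping table and reused on later runs.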
