mirror of
https://github.com/probberechts/soccerdata.git
synced 2026-04-25 18:15:58 +03:00
[GH-ISSUE #597] No unique player identifier #108
Originally created by @ProjectPear100 on GitHub (May 28, 2024).
Original GitHub issue: https://github.com/probberechts/soccerdata/issues/597
Is there any way to join player data from multiple sources (FBref, WhoScored, SoFIFA) using an identifier or unique key of some sort? I have gone through the documentation and was unable to find any way to uniquely identify a player. Joining on names won't work, since multiple players can share the same name and names are spelled differently across sources.
@probberechts commented on GitHub (May 28, 2024):
You'll have to programmatically and/or manually map the player IDs between different data sources. There are no shortcuts, and soccerdata does not provide any support for doing this. In the future, I am considering adding support for replacing player names / IDs with a standardized name / ID given a mapping (similar to what is already possible for teams using the config/teamname_replacements.json file). But I consider creating the mapping itself out of scope for this package.
For some tips on how to match player IDs across multiple data sources, I can recommend this blog post: https://unravelsports.github.io/2022/07/11/player-id-matching-system.html
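As a rough illustration of what "programmatically mapping" IDs could look like, here is a minimal sketch using only the Python standard library. It is not part of soccerdata and not the method from the linked blog post. It assumes each source can be reduced to records with an ID, a name and a birth date; the field names (`id`, `name`, `birth_date`), the ID values and the 0.85 threshold are all hypothetical. Candidates must share a birth date, and name similarity (after stripping accents and punctuation) picks the best one, which addresses both the "same name" and the "different spelling" problems raised above.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize_name(name: str) -> str:
    """Lowercase a name and drop combining accents and punctuation."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in stripped.lower())
    return " ".join(cleaned.split())


def match_players(source_a, source_b, threshold=0.85):
    """Greedily map each ID in source_a to at most one ID in source_b.

    Two records are candidates only if their birth dates are equal;
    among candidates, the most similar normalized name wins, provided
    the similarity ratio reaches `threshold` (an arbitrary choice here).
    """
    mapping = {}
    used = set()
    for a in source_a:
        best_id, best_score = None, threshold
        for b in source_b:
            if b["id"] in used or a["birth_date"] != b["birth_date"]:
                continue
            score = SequenceMatcher(
                None, normalize_name(a["name"]), normalize_name(b["name"])
            ).ratio()
            if score >= best_score:
                best_id, best_score = b["id"], score
        if best_id is not None:
            mapping[a["id"]] = best_id
            used.add(best_id)
    return mapping


# Illustrative records; the IDs are made up and do not come from any real source.
source_a = [
    {"id": "a1", "name": "Luka Modrić", "birth_date": "1985-09-09"},
    {"id": "a2", "name": "Martin Ødegaard", "birth_date": "1998-12-17"},
    {"id": "a3", "name": "John Smith", "birth_date": "1990-01-01"},
]
source_b = [
    {"id": 101, "name": "Luka Modric", "birth_date": "1985-09-09"},
    {"id": 102, "name": "Martin Odegaard", "birth_date": "1998-12-17"},
    {"id": 103, "name": "John Smith", "birth_date": "1992-05-05"},
]

mapping = match_players(source_a, source_b)
print(mapping)
```

In this toy data the two "John Smith" records are left unmatched because their birth dates differ, while the accented spellings are still paired. The matching is greedy and depends on the threshold, so anything it leaves unmatched, or matches with a low score, would still need manual review.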