mirror of
https://github.com/probberechts/soccerdata.git
synced 2026-04-25 18:15:58 +03:00
[GH-ISSUE #304] [FBref] Handle canceled / forfeited games #58
Originally created by @probberechts on GitHub (Jul 23, 2023).
Original GitHub issue: https://github.com/probberechts/soccerdata/issues/304
As pointed out in #286 by @lorenzodb1, running the fbref.read_player_match_stats function fails when the list of seasons to be scraped contains canceled or forfeited games. Examples of such games are Lyon vs. Reims on Friday, March 13, 2020 and Hellas Verona vs. Roma on Saturday, September 19, 2020. The main issue is that the summary player stats table for these games contains different columns than the corresponding table for completed games (e.g., it adds a "PkWon" column and misses all non-performance stats).
I see two options, currently preferring the first one:
@lorenzodb1 commented on GitHub (Jul 24, 2023):
Thank you for creating this issue. I prefer the second option, as it maintains the integrity of the data scraped (i.e., no data will be missing).
@probberechts commented on GitHub (Jul 24, 2023):
It will require a much more complicated implementation for only a limited number of games. But fine with me if you can implement it.
Also, I am wondering whether stats collected in forfeited games count toward a team/player's season totals. Maybe that should decide how we address this?
@lorenzodb1 commented on GitHub (Jul 24, 2023):
That's what https://github.com/probberechts/soccerdata/pull/286 did. Maybe we can pick up that PR again and improve it?
iirc they do, but I'm not 100% sure.
@probberechts commented on GitHub (Jul 25, 2023):
I've added a unit test case in 5a4c724 to illustrate the intended behavior of the _concat function. The PR that you previously created broke this.
But you'll have to modify the _concat function indeed. What you'll have to do is:
@lorenzodb1 commented on GitHub (Jul 25, 2023):
So is _concat intended to map the column ("", "90s") to ("Performance", "90s")?
@probberechts commented on GitHub (Jul 25, 2023):
Yes, that's indeed one of the most common inconsistencies that it fixes.
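For readers following along, the mapping discussed above can be illustrated with a small sketch. This is not the actual _concat implementation in soccerdata; the function name fill_top_levels and the sample column lists are hypothetical, and the sketch assumes that every sub-level name (such as "90s") belongs to a single top-level group. It works on plain lists of (top, sub) tuples, which is how two-level column headers are commonly represented.

```python
def fill_top_levels(column_sets):
    """Hypothetical helper: replace empty top-level labels with a known one.

    Each element of column_sets is a list of (top, sub) tuples describing the
    two-level column header of one scraped table.
    """
    # Learn which non-empty top level goes with each sub-level name,
    # keeping the first one encountered across all tables.
    known = {}
    for columns in column_sets:
        for top, sub in columns:
            if top:
                known.setdefault(sub, top)

    # Rewrite every header, leaving labels untouched when nothing is known.
    return [
        [(top or known.get(sub, ""), sub) for top, sub in columns]
        for columns in column_sets
    ]


# Illustrative headers only: one complete table and one with a blank top level.
complete = [("Performance", "90s"), ("Performance", "Gls")]
partial = [("", "90s"), ("Performance", "Gls")]

fixed_complete, fixed_partial = fill_top_levels([complete, partial])
print(fixed_partial)
```

Once the headers agree, the tables can be stacked without the mismatched column producing a separate, mostly empty column. Tables from canceled or forfeited games would still need extra handling for columns that exist only there, such as "PkWon".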