[GH-ISSUE #796] Player ratings scraped incorrectly in SoFIFA #171

Closed
opened 2026-03-02 15:56:22 +03:00 by kerem · 4 comments

Originally created by @miguelperosanz on GitHub (Jan 23, 2025).
Original GitHub issue: https://github.com/probberechts/soccerdata/issues/796

Describe the bug
Player ratings scraped from SoFIFA are wrong. Only "overallrating", "potential" and "crossing" are correct; the "crossing" rating is repeated for all the other variables.

Affected scrapers
This affects the following scrapers:

  • [ ] ClubElo
  • [ ] ESPN
  • [ ] FBref
  • [ ] FiveThirtyEight
  • [ ] FotMob
  • [ ] Match History
  • [x] SoFIFA
  • [ ] Understat
  • [ ] WhoScored

Code example

import soccerdata as sd
sofifa = sd.SoFIFA(leagues="ENG-Premier League", versions="latest", no_cache=True)
player_ratings = sofifa.read_player_ratings(team=10, player=239085)
player_ratings

Error message

No specific error message

Additional context

The scraped data is incorrect. The code example requests the ratings for Erling Haaland and returns the following results:

  • Overallrating = 91 (CORRECT)
  • Potential = 91 (CORRECT)
  • Crossing = 58 (CORRECT)
  • Headingaccuracy = 58 (INCORRECT)
  • ... All remaining variables = 58 (INCORRECT)

This pattern of repeating the "Crossing" rating is happening for every player.
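
The symptom described above can be checked for mechanically. Below is a minimal sketch, assuming the scraped ratings for one player are available as a plain dict of column name to value. The helper `looks_like_repeated_scrape` is hypothetical and not part of soccerdata, and the column names `finishing` and `shortpassing` merely stand in for the "remaining variables" mentioned in the report.

```python
def looks_like_repeated_scrape(ratings, reference="crossing",
                               trusted=("overallrating", "potential")):
    """Return True if every column other than the trusted ones and the
    reference column carries exactly the reference column's value."""
    others = [name for name in ratings
              if name != reference and name not in trusted]
    if not others:
        return False  # nothing to compare against
    return all(ratings[name] == ratings[reference] for name in others)


# Values as reported in this issue; "finishing" and "shortpassing" are
# assumed names for the remaining variables, which were all reported as 58.
reported = {
    "overallrating": 91,
    "potential": 91,
    "crossing": 58,
    "headingaccuracy": 58,
    "finishing": 58,
    "shortpassing": 58,
}

# Made-up values for contrast only; these are not real SoFIFA ratings.
illustrative_ok = {
    "overallrating": 91,
    "potential": 91,
    "crossing": 58,
    "headingaccuracy": 83,
    "finishing": 94,
}

print(looks_like_repeated_scrape(reported))         # True
print(looks_like_repeated_scrape(illustrative_ok))  # False
```

A check like this only flags the specific pattern reported here (every attribute equal to "crossing"); it cannot tell whether the individual values are right.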

Contributor Action Plan

  • [ ] I can fix this issue and will submit a pull request.
  • [ ] I’m unsure how to fix this, but I'm willing to work on it with guidance.
  • [x] I’m not able to fix this issue.

@probberechts commented on GitHub (Jan 23, 2025):

Ah, seems I was too quick...

The issue is the "//" before "em" in this xpath selector:

https://github.com/probberechts/soccerdata/blob/d31c9a2f757a35c83e85996d8feceb429b6b8fb7/soccerdata/sofifa.py#L479-L481

Based on some quick tests it has to be a combination of:

  • f"//p[.//text()[contains(.,'{s}')]]/span/em"
  • f"//div[contains(.,'{s}')]/em"
  • f"//li[not(self::script)][.//text()[contains(.,'{s}')]]/em" (not sure if the li tag is still used somewhere)

where s is the name of the statistic.
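
To illustrate why a "//" before "em" can produce the repeated value, here is a minimal sketch using lxml. The HTML is hypothetical markup written for this example, not copied from sofifa.com, and the `loose` selector is an assumed stand-in for the bug class rather than the actual code at the linked lines. The `strict` selector is the first candidate from the list above.

```python
from lxml import html

# Hypothetical markup for illustration only (not the real SoFIFA page).
SAMPLE = """
<div class="grid">
  <div class="col">
    <p><span><em>58</em></span> Crossing</p>
    <p><span><em>94</em></span> Finishing</p>
    <p><span><em>83</em></span> Heading accuracy</p>
  </div>
</div>
"""


def first_value(tree, selector):
    """Return the text of the first node matched by the selector, or None."""
    nodes = tree.xpath(selector)
    return nodes[0].text if nodes else None


tree = html.fromstring(SAMPLE)
s = "Heading accuracy"

# Assumed loose selector: any <div> containing the label matches, including
# the wrappers around the whole block, and "//em" then collects every
# descendant <em>. The first one in document order belongs to "Crossing".
loose = f"//div[.//text()[contains(.,'{s}')]]//em"

# Candidate from the comment: only the <p> holding the label matches, and
# "/span/em" follows direct children instead of all descendants.
strict = f"//p[.//text()[contains(.,'{s}')]]/span/em"

print(first_value(tree, loose))   # 58
print(first_value(tree, strict))  # 83
```

With this markup both selectors return 58 for "Crossing" itself, which is consistent with the report that only the first attribute came back correct.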

If someone has time to debug this properly, please create a PR.


@miguelperosanz commented on GitHub (Feb 8, 2025):

Hi, the bug is still there and I am not able to fix it. Any ideas?


@probberechts commented on GitHub (Feb 9, 2025):

Should be fixed in v1.8.7, thanks to @franciscofguerreiro 🎉


@miguelperosanz commented on GitHub (Feb 9, 2025):

It works! Thanks a lot!
