Attention SPREADSHEET gurus: help needed

benembry · July 14, 2021, 10:20pm

I’m working on a Google Sheet that is importing some pages from Fangraphs, but there is one page that I have not been able to crack.

I’d like to import the Steamer ROS page and I can get the first 50 players, but that’s it. Someone taught me to put “page=1_1000” at the end of the link and that has worked on some pages but not this one.

Here’s my formula:

=importhtml(“2021 Projections » steamerr » Batters | FanGraphs Baseball”)

Does anyone have a theory that they think could solve this issue? Is there anyone who has already solved it for themselves and doesn’t mind letting me in on the secret? Any and all advice is appreciated.

faker · July 14, 2021, 10:53pm

I’m fairly certain you can’t pull those programmatically with importhtml because they use js to pull the next 50. I tried to solve this a year or so back and my solution was to use a web crawler like selenium to pull the csv exports to my local machine. This is not however my expertise so someone else maybe have a more google sheets type answer.

doorbot · July 15, 2021, 3:58am

I’ve mentioned this in the past to others regarding this, but I still think the best way to get this data is from the player pages directly. If players that are not rostered need to be included, a series of calls that filter the data—by position, team, player, etc.—could be a fine proxy for the full data set.

doorbot · July 15, 2021, 4:16am

For reference, simply include the player IDs:

benembry · July 15, 2021, 11:34am

Yes the JavaScript seems to be the impediment. I may have found a solution last night on this page: JSON To Google Sheets: Free & Open-Source Method [Tutorial] | Apipheny

I haven’t tried it yet. Will post an update once I do.

faker · July 15, 2021, 1:25pm

If this doesnt work out I can dig up the selenium script i had written to pull steamer projections and email it to you. If you are familiar with python and have an environment to do so, its not too complicated. A cursory google told me that you can upload local csvs to google sheets programatically using their api, so you could add that to the selenium script and have a fangraphs page → google sheets solution