Twitter repository for #pdftribute

Update #1: Now with ~30,000 tweets!
Update #2: Now with ~40,000 tweets!
Update #3: Now with ~50,000 tweets!

You may be following pdftribute.net, a great initiative that scrapes the #pdftribute links as they stream. I am posting a more comprehensive repository (to be updated regularly) that people can download to start their own analyses. (Edit: this one has about 50,000 tweets, whereas I believe pdftribute.net is based on around 10,000, perhaps because it started streaming later. I could be wrong.)

Early on, I was watching TweetReach to see how many tweets would show up in an API search. When it hit 1,500 tweets, I started tracking #pdftribute as it streamed, as a public service (“twitter dump.csv” in the zip file), and also ran an API search for those first 1,500 tweets (“early twitter dump.csv”). This way, I think I’ve captured almost all the tweets, though I may have missed a few of the early ones.

The tracking is still going on, and I will continue to update the repository.

If you can extract the links to files from the repository, please email me so I can upload the files themselves!
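If you want a starting point for pulling the links out, here is a rough Python sketch. It is not a finished tool: since I am not assuming anything about the column layout of the CSV files, it simply scans every cell for anything that looks like a URL and keeps each one once, in the order first seen. The function name `extract_links` and the sample rows are made up for illustration. Links in tweets are usually shortened (t.co and the like), so they would probably still need to be resolved before the files can be downloaded.

```python
import csv
import io
import re

# Matches http(s) URLs up to the next whitespace, quote, or angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def extract_links(csv_file):
    """Return the unique URLs found in any cell of a CSV, in first-seen order.

    `csv_file` is any open text file (or file-like object). No column names
    are assumed, so every cell of every row is scanned.
    """
    seen = set()
    links = []
    for row in csv.reader(csv_file):
        for cell in row:
            for url in URL_PATTERN.findall(cell):
                # Drop punctuation that often trails a link in running text.
                url = url.rstrip(".,;:!?)")
                if url and url not in seen:
                    seen.add(url)
                    links.append(url)
    return links


if __name__ == "__main__":
    # Made-up sample rows, only to show the call; not real data from the dump.
    sample = io.StringIO(
        "user,text\n"
        'alice,"Paper here: http://example.com/a.pdf #pdftribute"\n'
        'bob,"RT http://example.com/a.pdf, and https://example.org/b.pdf."\n'
    )
    for link in extract_links(sample):
        print(link)
```

To run it on the real data, open the file instead of the sample, for example `open("twitter dump.csv", newline="", encoding="utf-8")`. The encoding is a guess on my part, so adjust it if the file does not read cleanly.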

Thank you all very much for your support!

1 Comment » for Twitter repository for #pdftribute
  1. Hi, I started a scraper too, collecting only unique URLs, at pdftribute.loc-com.de. The data is available as JSON at pdftribute.loc-com.de/json and is free to use!
