Conversation

get conversation tree from question

Each #d status with a reply_count > 0 is the start of a diagnostic conversation tree.

reply_count

  • Not available in the standard API
  • Available from TweetScraper
  • getstatus/replycount.py

The standard (free of charge) Twitter API does not provide a way to get all replies to a specific status. The following method works around this limitation:

  1. Use TweetScraper.
  2. Search for all replies to the user who posted the question status, after a certain date and time.
  3. These replies must be filtered on "in_reply_to_status_id", but this field is not present in the JSON objects obtained with TweetScraper.
  4. Get the full Twitter object for each reply with the standard API.
  5. Filter all collected replies with status["in_reply_to_status_id"] == status_id.
  6. If true, add the reply to the corpus database.
  7. Repeat the process recursively for each reply whose reply_count is greater than 0.
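Steps 5 to 7 can be sketched as a small recursive filter. This is only an illustration, not the project's actual code: the function name `collect_replies` is hypothetical, and it assumes each candidate is a plain dict that already merges `reply_count` (from TweetScraper) with `id` and `in_reply_to_status_id` (from the standard API). It also assumes the candidate list already holds every scraped status, so it does not launch a new search for each reply.

```python
def collect_replies(status_id, candidates, corpus=None):
    """Collect the statuses in `candidates` that belong to the
    conversation tree rooted at `status_id`.

    `candidates` is assumed to be a list of dicts with the keys
    "id", "in_reply_to_status_id" and "reply_count".
    """
    if corpus is None:
        corpus = []  # stands in for the corpus database
    for status in candidates:
        # Step 5: keep only direct replies to the current status.
        if status["in_reply_to_status_id"] == status_id:
            # Step 6: add the reply to the corpus.
            corpus.append(status)
            # Step 7: recurse only when the reply has replies of its own.
            if status.get("reply_count"):
                collect_replies(status["id"], candidates, corpus)
    return corpus
```

Calling `collect_replies(210290960695959553, candidates)` would return the replies in depth-first order, which keeps each answer next to the sub-thread it started.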

The original tweet is the first doc(s)toctoc tweet, posted on 2012-06-06 (https://twitter.com/DrKoibo/status/210290960695959553). The search query is "to:DrKoibo since:2012-06-06".

# using pipenv
pipenv run scrapy crawl TweetScraper -a query="to:DrKoibo since:2012-06-06"

This returns 8111 statuses (as of 2018-03-29).
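The search query can be derived from the status URL rather than typed by hand. The sketch below is an assumption about how this might be automated: `query_from_status_url` is a hypothetical helper, and it expects URLs of the form `https://twitter.com/<screen_name>/status/<id>`.

```python
from urllib.parse import urlparse


def query_from_status_url(url, since):
    """Return (status_id, query) for a status URL and a YYYY-MM-DD date."""
    # Path is expected to look like "/DrKoibo/status/210290960695959553".
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) != 3 or parts[1] != "status":
        raise ValueError("not a status URL: " + url)
    screen_name, status_id = parts[0], int(parts[2])
    # Same query syntax as the one passed to TweetScraper.
    return status_id, "to:{} since:{}".format(screen_name, since)
```

The returned status id is the value to compare against "in_reply_to_status_id" when filtering the scraped replies.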