27/03/2021

[Python] Access Reddit posts and comments with PRAW

To programmatically access posts and comments from a Reddit subreddit, you can use Python's PRAW library.


You must log in to Reddit, then register your application to get a client ID and secret.


Once you're set, it is simply a matter of:


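 import praw

 POST_LIMIT = 10  # example value: how many of the newest posts to fetch

 reddit = praw.Reddit(
     client_id="CLIENT_ID",
     client_secret="CLIENT_SECRET",
     user_agent="WHATEVER",
 )

 submissions = reddit.subreddit("SUBREDDIT_NAME").new(limit=POST_LIMIT)

 for submission in submissions:
     # Resolve all "load more comments" placeholders (can trigger many requests)
     submission.comments.replace_more(limit=None)

     # This iterates top-level comments only; submission.comments.list() flattens all replies
     for comment in submission.comments:
         pass  # do something with the comment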


Some things to note:

A post is represented by a submission object. Some interesting attributes are:

  • created_utc: the time this post was created, as a Unix timestamp (seconds since the epoch, UTC)
  • author
  • score: every post starts at 1, and the value changes as users vote
  • title
  • selftext: the body of the post
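
As a rough sketch of how the attributes above can be read, the helper below copies them into a plain dict and converts created_utc to a readable timestamp. The function name summarize_submission and the stand-in object are hypothetical, used here only so the snippet runs without a Reddit connection; with PRAW you would pass a real submission.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def summarize_submission(submission):
    """Copy the attributes discussed above into a plain dict (hypothetical helper)."""
    # created_utc is a Unix timestamp in seconds; build a timezone-aware datetime.
    created = datetime.fromtimestamp(submission.created_utc, tz=timezone.utc)
    # author can be None (for example when the account no longer exists).
    author = submission.author.name if submission.author is not None else None
    return {
        "created": created.isoformat(),
        "author": author,
        "score": submission.score,
        "title": submission.title,
        "selftext": submission.selftext,
    }


# Stand-in object for illustration only; not a real PRAW submission.
example = SimpleNamespace(
    created_utc=0,
    author=SimpleNamespace(name="some_user"),
    score=1,
    title="Hello",
    selftext="Body text",
)
print(summarize_submission(example))
```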

 

Unfortunately, there is no simple way to check whether a post was deleted, but in general the following rule works:

submission.author is None or not submission.is_robot_indexable or submission.selftext == "[deleted]" or submission.selftext == "[removed]"

Remember, however, that the [deleted] and [removed] strings are language-dependent, so depending on your settings you might not get them in English.
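
The rule can be wrapped in a small helper so it is easy to reuse while looping over posts. This is only a sketch: the name looks_deleted and the DELETED_MARKERS set are assumptions of mine, the set covers the English strings only, and the stand-in objects exist just so the snippet runs offline.

```python
from types import SimpleNamespace

# English marker strings only; extend this set if your settings return other languages.
DELETED_MARKERS = {"[deleted]", "[removed]"}


def looks_deleted(submission):
    """Heuristic from the rule above; it is not an official PRAW check."""
    return (
        submission.author is None
        or not submission.is_robot_indexable
        or submission.selftext in DELETED_MARKERS
    )


# Stand-in objects for illustration only; with PRAW you would pass real submissions.
live_post = SimpleNamespace(author="some_user", is_robot_indexable=True, selftext="Hi")
removed_post = SimpleNamespace(author="some_user", is_robot_indexable=True, selftext="[removed]")

print(looks_deleted(live_post), looks_deleted(removed_post))
```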


A comment is represented by a comment object. Its text is in the body attribute.

To verify if a comment was posted by a moderator (or a moderator bot), you can check:

comment.distinguished == "moderator"
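
One possible use of this check is to skip moderator comments when collecting comment text. The sketch below assumes that is what you want; the function name non_moderator_bodies is hypothetical, and the stand-in comments only mimic the body and distinguished attributes so the snippet runs without PRAW.

```python
from types import SimpleNamespace


def non_moderator_bodies(comments):
    """Return the body of every comment not distinguished as a moderator comment."""
    return [c.body for c in comments if c.distinguished != "moderator"]


# Stand-in comments for illustration only; with PRAW, pass submission.comments instead.
sample_comments = [
    SimpleNamespace(body="Please read the rules.", distinguished="moderator"),
    SimpleNamespace(body="Great post!", distinguished=None),
]
print(non_moderator_bodies(sample_comments))
```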
