To programmatically access posts and comments from a Reddit subreddit, you can use Python's PRAW library.
You must log in to Reddit, then register your application to get a client ID and secret.
Once you're set, it is simply a matter of:
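import praw

# Read-only client; the credentials come from the application you registered
reddit = praw.Reddit(
    client_id="CLIENT_ID",
    client_secret="CLIENT_SECRET",
    user_agent="WHATEVER",
)

submissions = reddit.subreddit("SUBREDDIT_NAME").new(limit=POST_LIMIT)
for submission in submissions:
    # Resolve every "load more comments" placeholder (can be slow on large threads)
    submission.comments.replace_more(limit=None)
    # Iterating the forest yields top-level comments; use .list() to get all of them
    for comment in submission.comments:
        pass  # do something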
Some things to note:
A post is represented by a submission object. Some interesting attributes are:
- created_utc: the time the post was created, as a Unix timestamp (seconds since the epoch, in UTC)
- author
- score: every post starts with a score of 1, which then changes based on user votes
- title
- selftext: the body of the post
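As a rough sketch of how these attributes might be read in practice, the snippet below collects them into a plain dict and converts created_utc into a timezone-aware datetime. The helper name summarize_submission is made up for illustration, and the SimpleNamespace object is only a stand-in so the example runs without credentials; with PRAW you would pass a real submission instead.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def summarize_submission(submission):
    """Collect the attributes listed above into a dict (hypothetical helper)."""
    # created_utc is a Unix timestamp, so convert it explicitly as UTC
    created = datetime.fromtimestamp(submission.created_utc, tz=timezone.utc)
    # author can be None (for example when the account is gone)
    author = submission.author.name if submission.author is not None else None
    return {
        "created": created.isoformat(),
        "author": author,
        "score": submission.score,
        "title": submission.title,
        "selftext": submission.selftext,
    }


# Stand-in object for illustration only, not a real PRAW submission
fake_submission = SimpleNamespace(
    created_utc=1700000000,
    author=SimpleNamespace(name="example_user"),
    score=1,
    title="Hello",
    selftext="Body text",
)
summary = summarize_submission(fake_submission)
print(summary)
```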
Unfortunately, there is no simple way to check whether a post was deleted, but as a rule of thumb you can treat it as deleted when the following expression is true:
submission.author is None or not submission.is_robot_indexable or submission.selftext == "[deleted]" or submission.selftext == "[removed]"
Keep in mind, however, that the [deleted] and [removed] strings are language-dependent, so depending on your settings you might not get them in English.
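The rule above can be wrapped in a small function so it is easier to reuse. This is only a sketch of the same heuristic: the name looks_deleted and the PLACEHOLDERS set are my own, the set holds the English strings only, and the SimpleNamespace objects are stand-ins for real submissions.

```python
from types import SimpleNamespace

# English placeholders only; extend this set if your settings return other languages
PLACEHOLDERS = {"[deleted]", "[removed]"}


def looks_deleted(submission):
    """Heuristic from the rule above: True if the post appears deleted or removed."""
    return (
        submission.author is None
        or not submission.is_robot_indexable
        or submission.selftext in PLACEHOLDERS
    )


# Stand-in objects for illustration only, not real PRAW submissions
live_post = SimpleNamespace(
    author=SimpleNamespace(name="example_user"),
    is_robot_indexable=True,
    selftext="Still here",
)
removed_post = SimpleNamespace(
    author=SimpleNamespace(name="example_user"),
    is_robot_indexable=True,
    selftext="[removed]",
)
print(looks_deleted(live_post), looks_deleted(removed_post))
```

Since this is a heuristic, it may misclassify some posts, so treat the result as a hint more than a guarantee.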
A comment is represented by a comment object; its text is in the body attribute.
To check whether a comment was posted by a moderator (or a moderator bot), you can test:
comment.distinguished == "moderator"
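One possible use of this check is to drop moderator comments before processing the rest. The sketch below assumes you already have a list of comments; the function names are hypothetical and the SimpleNamespace objects stand in for real comments.

```python
from types import SimpleNamespace


def is_moderator_comment(comment):
    """True when the comment is distinguished as coming from a moderator."""
    return comment.distinguished == "moderator"


def without_moderator_comments(comments):
    """Return only the comments that are not distinguished as moderator."""
    return [c for c in comments if not is_moderator_comment(c)]


# Stand-in objects for illustration only, not real PRAW comments
sample_comments = [
    SimpleNamespace(body="Please read the rules.", distinguished="moderator"),
    SimpleNamespace(body="Nice post!", distinguished=None),
]
kept = without_moderator_comments(sample_comments)
print([c.body for c in kept])
```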