Neural Network based on 'The Lounge'

So I had this neat idea of training a neural network on all the posts made in the Lounge, but I need a couple of questions about scraping data answered first.

One big question I have first is:
Is there a way to grab raw post data? If not, what template does this forum use? If I know the template, I can clean the data much more easily than by figuring out the source from scratch.

The next step would be feeding all that raw data into a text file with usernames attached to each subsequent post.
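A minimal sketch of that step, assuming each post has already been parsed into a dict with `username` and `cooked` (HTML body) keys. Those field names are an assumption about what the forum returns, and `strip_html`, `posts_to_lines`, and `write_corpus` are hypothetical helper names, not anything the forum provides.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(fragment):
    """Return the visible text of an HTML fragment with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(fragment)
    parser.close()  # flush any text still buffered by the parser
    return " ".join("".join(parser.parts).split())


def posts_to_lines(posts):
    """Turn post dicts into 'username: text' lines (field names are assumed)."""
    return [f"{p['username']}: {strip_html(p['cooked'])}" for p in posts]


def write_corpus(posts, path):
    """Write one post per line to a UTF-8 text file."""
    with open(path, "w", encoding="utf-8") as fh:
        for line in posts_to_lines(posts):
            fh.write(line + "\n")
```

Keeping one post per line with the username up front makes it easy to filter by author later or to drop the prefix before training.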

I can make the neural network pretty easily after that.

Can’t answer your questions, but I have one very important question for you:

7 Likes

It would be cool to see what it spits out.

The lounge has a ton of textual data spanning years of operation with consistently active members.

That’s like a gold mine.

This should do it

https://forum.level1techs.com/t/the-lounge-2018-06-june-philosophically-alcoholic-edition/127986.json

just add .json to any URL
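For scripting that, a rough sketch using only the standard library might look like this. `topic_json_url` and `fetch_topic` are hypothetical helper names and the User-Agent string is a placeholder; if the forum only serves the JSON to logged-in users, the request would also need session credentials, which this sketch does not handle.

```python
import json
import urllib.request


def topic_json_url(topic_url):
    """Append .json to a topic URL, dropping any trailing slash first."""
    return topic_url.rstrip("/") + ".json"


def fetch_topic(topic_url, timeout=30):
    """Download and decode the JSON form of a topic page."""
    request = urllib.request.Request(
        topic_json_url(topic_url),
        headers={"User-Agent": "lounge-corpus-sketch"},  # placeholder value
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_topic` with the thread URL (without the `.json` suffix) should return the same structure you see in the browser, as a Python dict.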

4 Likes

Wait you’re gonna scan all the past threads?

I don’t know if I would recommend that. <_<

5 Likes

Can you be more specific? A neural network doing what?

fyi, you need to be authenticated.

Learning the order of characters in a post. Pretty simple, but there are some other things that can make the output look more like English. I have to find all the papers I’ve read on the subject.

@Eden
I can see the json file just fine :slight_smile:

Because you’re logged in. I don’t think it appears if you’re not.

2 Likes

I’ll keep that in mind.

This is a neat idea, but I’m still not sure what you are trying to do. Classify the posts by author? Make the AI participate in conversations? “Neural Net” is a very generic term :slight_smile:

1 Like

Predict the likelihood of one character appearing after another.
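To make that concrete, here is a plain counting baseline for the same quantity. It is not a neural network, just bigram frequencies, and a network would learn a similar distribution conditioned on more context. The function names are made up for illustration.

```python
from collections import Counter, defaultdict


def bigram_counts(text):
    """Count how often each character is followed by each other character."""
    counts = defaultdict(Counter)
    for current, following in zip(text, text[1:]):
        counts[current][following] += 1
    return counts


def next_char_probabilities(counts, char):
    """Estimate P(next character | char) from the raw counts."""
    followers = counts.get(char)
    if not followers:
        return {}  # char never appeared with a successor
    total = sum(followers.values())
    return {c: n / total for c, n in followers.items()}
```

For the text `"abab ac"`, the character `a` is followed by `b` twice and `c` once, so the estimate for `b` after `a` comes out at two thirds.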

Have a look at:
http://nlp.fast.ai/category/classification.html

Microsoft Tay 2.0 incoming.

10 Likes

You mean you’re making cyber replicants to replace us! Fuck you globalist!

5 Likes

@Eden

So I guess only 16 posts are loaded at a time. Would it be possible to grab them all at the same time or would it have to be iterative?

I’m not terribly familiar with JSON, but I can kinda see what’s going on.


@noenken @Braysive Have fun being replaced: https://static1.fjcdn.com/thumbnails/comments/Oh+come+on+levvy+who+if+not+you+should+know+_dbc775a632f48de4552ef0cc17bf23b7.png

Iterative as far as I’m aware, but that’s not too hard to do. It gives you the post IDs etc.

It should give you 20 at a time with content and an index of all post ids
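A sketch of the iterative part, under some loud assumptions: that the topic JSON has a `post_stream` object holding a `stream` list (all post IDs) and a `posts` list (the ones already loaded), and that the rest can be requested in batches from a `/t/<topic_id>/posts.json?post_ids[]=...` endpoint. Check both against the real responses before relying on them. The helper names are hypothetical.

```python
from urllib.parse import urlencode


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def missing_post_ids(topic):
    """Return IDs listed in the index but not included in the first response."""
    stream = topic["post_stream"]  # assumed layout of the topic JSON
    loaded = {post["id"] for post in stream["posts"]}
    return [pid for pid in stream["stream"] if pid not in loaded]


def batch_urls(base_url, topic_id, post_ids, batch_size=20):
    """Build one request URL per batch of post IDs (endpoint is an assumption)."""
    urls = []
    for batch in chunked(post_ids, batch_size):
        query = urlencode([("post_ids[]", pid) for pid in batch])
        urls.append(f"{base_url}/t/{topic_id}/posts.json?{query}")
    return urls
```

Fetching the URLs one at a time with a short pause between requests keeps the load on the forum low.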

If you want to look at it visually, open it in Firefox.

Yeah that’s what I’m doing.

My weekend is now gone. I’m so excited :smiley:

Should be good, just wanted to confirm. I’ll do some reading.

1 Like

There’s a bunch of Markov bot and speech chain nets up on GitHub already if you just want a quick result. Scrapy can help with getting the content.
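For a sense of how little a Markov bot needs, here is a bare-bones word-level chain. It is a sketch for illustration, not taken from any of those GitHub projects, and the function names are made up.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words observed right after it."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate(chain, start, length, rng=None):
    """Walk the chain from `start`, picking a random observed successor each step."""
    rng = rng or random.Random()
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:
            break  # dead end: the word was never followed by anything
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)
```

Because repeated successors stay in the list, `rng.choice` picks them in proportion to how often they occurred in the source text.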

I don’t see how this can turn into anything positive

1 Like