Steve's eSports Scraper
Solo Developer
MongoDB, Web Scraping, NodeJS
18 Hours

My scraper app finds eSports stories on a news site and stores each story's link URL, headline, and image URL in a MongoDB database. The index.html template pulls in all of the available stories (newest entries first) and places them in their correct locations on the page.

The first entry takes the featured area, the next six take the mid-size locations, and all remaining entries are listed as headlines only. I cycle through the entries newest-first using a reverse for loop: the counter starts at the array length and decreases by 1 after each pass.
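The placement logic described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `layoutStories` and the `featured`, `midSize`, and `headlines` keys are hypothetical, and the sketch assumes the stories arrive oldest-first, so it starts the counter at the last valid index (array length minus 1).

```javascript
// Hypothetical helper: split stories (assumed stored oldest-first) into page sections.
function layoutStories(stories) {
  const layout = { featured: null, midSize: [], headlines: [] };

  // Reverse loop: start at the last index so the newest entry is handled first.
  for (let i = stories.length - 1; i >= 0; i--) {
    const story = stories[i];
    if (layout.featured === null) {
      layout.featured = story;          // newest story takes the featured area
    } else if (layout.midSize.length < 6) {
      layout.midSize.push(story);       // next six take the mid-size locations
    } else {
      layout.headlines.push(story);     // everything else is headline only
    }
  }
  return layout;
}
```

Keeping the split in one function like this would leave the template responsible only for rendering each section.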

The scraping button searches the site for all news stories; at this point, it does not check for duplicate entries. When a user clicks a story, they are taken to a comment page, where the story can be marked as a favorite and commented on. Another click links to the external site, where the user can read the full story instead of just the headline. User-specific authentication was outside the scope of this app.
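The app does not currently check for duplicates. One way such a check might look, assuming each story can be identified by its link URL, is sketched below. The helper name `filterNewStories` and the `link` field are hypothetical and not taken from the project.

```javascript
// Hypothetical helper: keep only scraped stories whose link is not already stored.
function filterNewStories(scraped, existingLinks) {
  const seen = new Set(existingLinks);
  const fresh = [];
  for (const story of scraped) {
    if (!seen.has(story.link)) {
      seen.add(story.link); // also guards against repeats within a single scrape
      fresh.push(story);
    }
  }
  return fresh;
}
```

In this sketch, `existingLinks` would be the link URLs already in the database, and only the returned stories would be inserted.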

Full Stack Node Application
Cheerio Web Scraping
Express Server Integration
Custom API Endpoints
MVC Architecture
Deployed Project
Code Repository
Deprecation Notice

This project has not been maintained since 2018 and may no longer be functional or secure.

Next Steps

User authentication was outside the scope of this app, as the focus was scraping stories into a MongoDB database. A commenting system is not very useful without a sense of community, so I would like to support multiple users and more than one comment. I would also like to include the date of each article and to pull stories from multiple sources. Now that I’ve seen the finished product, I’d also find a lot of value in a blurb about each article, but that was also outside the scope of this scraping activity.

As a final note, I ran out of time on this project, which is why several of the wish-list items above were left out. I used an API controller for all of the database calls, but the server gives direct access to the HTML files instead of serving them through route calls. I would update the Express HTML routing as well.