
I'm writing a scraper for Curse.com in C# as part of a Kerbal Space Program mod manager. Every other manager out there was at least one of the following:

  • Bad at everything
  • Unable to auto-update
  • Written in Java

So I'm making my own.

Mod Page

The relevant HTML on the mod page looks like this:

<ul class="authors group">
    <li>Project Manager: <a href="/users/r4m0n">r4m0n</a></li>
    <li>Contributor: <a href="/users/sarbian">sarbian</a>, <a href="/users/Anatid">Anatid</a></li>                   
</ul>
<ul class="details-list">
    <li class="game"><a href="/ksp-mods/kerbal">Kerbal Space Program</a></li>
    <li class="average-downloads">56,163 Monthly Downloads</li>
    <li class="version version-out-of-date">Supports: 0.25</li>
    <li class="downloads">273,770 Total Downloads</li>
    <li class="updated">Updated <abbr class="standard-date" title="Thu, 9 Oct 2014 18:04:41 CDT (UTC-5:00)" data-epoch="1412895881">10/09/2014</abbr></li>
    <li class="updated">Created <abbr class="standard-date" title="Tue, 6 May 2014 13:15:45 CDT (UTC-5:00)" data-epoch="1399400145">05/06/2014</abbr></li>
    <li class="favorited">875 Favorites</li>
    <li class="curseforge"><a href="http://www.curseforge.com/projects/220221/">Project Site</a></li>        
    <li class="comments"><a href="#comments">Comments</a></li>
    <li class="release">Release Type: Release</li>
    <li class="license">License: GNU General Public License version 3 (GPLv3) </li>
    <li class="newest-file">Newest File: MechJeb2-2.4.0.0.zip</li>
</ul>
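
The parser's source isn't reproduced in this post, so the following is only a minimal sketch of how the authors list above could be turned into role/name pairs. It assumes plain regular expressions are good enough for this fixed markup; the names `AuthorSketch` and `ParseAuthors` are made up for illustration and are not taken from the real code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class AuthorSketch
{
    // Hypothetical helper: maps a role key such as "project_manager"
    // to the user names linked inside that <li>.
    public static Dictionary<string, List<string>> ParseAuthors(string html)
    {
        var result = new Dictionary<string, List<string>>();

        // Each <li> looks like: Role Name: <a ...>user</a>, <a ...>user</a>
        foreach (Match li in Regex.Matches(html, @"<li>\s*([^:<]+):(.*?)</li>", RegexOptions.Singleline))
        {
            // "Project Manager" -> "project_manager"
            string key = li.Groups[1].Value.Trim().ToLowerInvariant().Replace(' ', '_');

            // Collect the link text of every anchor in this list item.
            List<string> names = Regex.Matches(li.Groups[2].Value, @"<a\s[^>]*>([^<]+)</a>")
                                      .Cast<Match>()
                                      .Select(m => m.Groups[1].Value.Trim())
                                      .ToList();
            result[key] = names;
        }
        return result;
    }

    static void Main()
    {
        string html = @"<ul class=""authors group"">
    <li>Project Manager: <a href=""/users/r4m0n"">r4m0n</a></li>
    <li>Contributor: <a href=""/users/sarbian"">sarbian</a>, <a href=""/users/Anatid"">Anatid</a></li>
</ul>";

        foreach (var pair in ParseAuthors(html))
            Console.WriteLine($"{pair.Key}: {string.Join(", ", pair.Value)}");
    }
}
```

Regex matching like this is brittle if the markup changes; a real HTML parser would be the sturdier choice for anything beyond a fixed snippet.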

Running that HTML through my parser produces:

{  
   "game_name":"Kerbal Space Program",
   "game_id":"kerbal",
   "average_downloads":56163,
   "supports":0.25,
   "downloads":273770,
   "updated":"2014-10-09T23:04:41Z",
   "created":"2014-05-06T18:15:45Z",
   "favorited":875,
   "project_url":"http://www.curseforge.com/projects/220221/",
   "release_type":"Release",
   "license":"GNU General Public License version 3 (GPLv3) ",
   "newest_file":"MechJeb2-2.4.0.0.zip",
   "project_manager":[  
      "r4m0n"
   ],
   "contributor":[  
      "sarbian",
      "Anatid"
   ]
}
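
Two of the conversions behind that output are worth spelling out: the download and favorite counts lose their thousands separators, and the dates come from the `data-epoch` attribute (Unix seconds) rather than from the CDT text, which is why they end up in UTC. Here is a rough sketch of both, assuming only standard .NET calls; `DetailsSketch`, `LeadingCount`, and `EpochToIso` are hypothetical names, not the ones used in the actual parser.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

static class DetailsSketch
{
    // "56,163 Monthly Downloads" -> 56163.
    // Throws FormatException if the text does not start with a number.
    public static int LeadingCount(string text)
    {
        Match m = Regex.Match(text, @"^\s*([\d,]+)");
        return int.Parse(m.Groups[1].Value, NumberStyles.AllowThousands, CultureInfo.InvariantCulture);
    }

    // data-epoch holds Unix seconds; format the instant as ISO 8601 in UTC.
    public static string EpochToIso(long epochSeconds)
    {
        return DateTimeOffset.FromUnixTimeSeconds(epochSeconds)
                             .UtcDateTime
                             .ToString("yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture);
    }

    static void Main()
    {
        Console.WriteLine(LeadingCount("56,163 Monthly Downloads")); // 56163
        Console.WriteLine(LeadingCount("273,770 Total Downloads"));  // 273770
        Console.WriteLine(EpochToIso(1412895881));                   // 2014-10-09T23:04:41Z
        Console.WriteLine(EpochToIso(1399400145));                   // 2014-05-06T18:15:45Z
    }
}
```

Reading the epoch instead of parsing the human-readable date avoids any time zone guesswork: 18:04:41 CDT (UTC-5:00) and 23:04:41Z are the same instant.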

With this in place, I can fetch the mod from Curse whenever I need to.

Source code is available, as always, on GitHub.

Christian Stewart


Also known as Quantum and Paralin.
