Computed·Blg

Entries tagged as github

Tuesday, August 30. 2011

Automatic spelling corrections on Github

Via Holden's Blog

By Holden Karau

-----

English has never been one of my strong points (as is fairly obvious by reading my blog), so my latest side project might surprise you a bit. Inspired by the results of tarsnap’s bug bounty and the first pull request received for a new project(slashem - a type safe rogue like DSL for querying solr in scala) I decided to write a bot for github to fix spelling mistakes.

The code its self is very simple (albeit not very good, it was written after I got back from clubbing @ JWZ’s club [DNA lounge]). There is something about a lack of sleep which makes perl code and regexs seem like a good idea. If despite the previous warnings you still want to look at the codehttps://github.com/holdenk/holdensmagicalunicorn is the place to go. It works by doing a github search for all the README files in markdown format and then running a limited spell checker on them. Documents with a known misspelled word are flagged and output to a file. Thanks to the wonderful github api the next steps is are easy. It forks the repo and clones it locally, performs the spelling correction, commits, pushes and submits a pull request.

The spelling correction is based on Pod::Spell::CommonMistakes, it works using a very restricted set of misspelled words to corrections.

Writing a “future directions” sections always seems like such a cliche, but here it is anyways. The code as it stands is really simple. For example it only handles one repo of a given name, and the dictionary is small, etc. The next version should probably also try and only submit corrections against the conical repo. Some future plans extending the dictionary. In the longer term I think it would be awesome to attempt detect really simple bugs in actual code (things like memcpy(dest,0,0)).

You can follow the bot on twitter holdensunicorn .

Comments, suggestions, and patches always appreciated. -holdenkarau (although I’m going to be AFK at burning man for awhile, you can find me @ 6:30 & D)

Posted by Christian Babski in Software at 18:10

Defined tags for this entry: github, robot, software

Related entries by tags:

2178 hits

(Page 1 of 1, totaling 1 entries)

Quicksearch

Popular Entries

Show tagged entries

3d printing
ai
android
cloud
data visualisation
google
hardware
innovation&society
interface
mobile
network
os
privacy
programming
security
software
tablet
technology
web
wifi

Syndicate This Blog

Calendar

Blog Administration

Open login screen

Entries tagged as github

Entries tagged as github

Tuesday, August 30. 2011

Automatic spelling corrections on Github

Quicksearch

Popular Entries

Categories

Show tagged entries

Syndicate This Blog

Calendar

Blog Administration