Science and Gender
The goal of this application is to predict gender, within a reasonable margin of error, using only the author names found in articles published by PLoS and accessible through http://api.plos.org/.
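As a rough sketch of the first step (getting given names out of article metadata), the snippet below builds a search URL and pulls first names from a parsed response. Everything beyond the base host is an assumption: the `/search` path, the Solr-style `q`, `fl`, `wt`, and `rows` parameters, the `author` field holding names as "Given Family", and the helper names `build_search_url` and `first_names` are illustrative and not taken from this project's code.

```python
from urllib.parse import urlencode

# Host comes from the project description; the "/search" path is an assumption.
API_URL = "http://api.plos.org/search"


def build_search_url(query, rows=10):
    """Build a search URL (parameter names are assumed, Solr-style)."""
    params = {"q": query, "fl": "id,author", "wt": "json", "rows": rows}
    return API_URL + "?" + urlencode(params)


def first_names(response):
    """Collect the given name of every author in a parsed JSON response."""
    names = []
    for doc in response.get("response", {}).get("docs", []):
        for author in doc.get("author", []):
            tokens = author.replace(".", " ").split()
            # Skip bare initials such as "J": they carry no gender signal.
            if tokens and len(tokens[0]) > 1:
                names.append(tokens[0].upper())
    return names


# Hand-written sample in the assumed response shape (not real API output).
sample = {"response": {"docs": [{"id": "x", "author": ["Mary Smith", "J. Doe"]}]}}
print(first_names(sample))
```

Upper-casing the names here is a convenience so they can be matched directly against upper-case name lists such as the census files listed under Resources.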
Resources
- Facebook Name and Gender Research data:
  - http://sites.google.com/site/facebooknamelist/namelist
- List of common gender-neutral names:
  - http://evan.nixsyspaus.org/names/
  - data: http://evan.nixsyspaus.org/names/ordered-names.txt
- "Baby Name Guesser" returns an HTML page for a name query, with a gender probability value and the name's popularity
  - http://www.gpeters.com/names/baby-names.php
- Wolfram Alpha returns name information for known names
  - e.g. http://www.wolframalpha.com/input/?i=Mary
  - Query for a single name (best for minimizing errors): http://www.wolframalpha.com/input/?i=name%2C+ELIZABETH
  - The Wolfram Alpha API allows each application only 2,000 queries per month, which makes it useless for us.
- US Census Data (DUH)
  - Names by rank with female/male breakdown: http://www.census.gov/genealogy/names/names_files.html
  - Male first data: http://www.census.gov/genealogy/names/dist.male.first
  - Female first data: http://www.census.gov/genealogy/names/dist.female.first
- We also use Wikipedia's lists of names, organized by category
  - http://en.wikipedia.org/wiki/Category:Given_names_by_gender
  - There is also a list of gender-neutral names: http://en.wikipedia.org/wiki/Category:Unisex_given_names
  - We didn't find out about the Wikipedia API (http://en.wikipedia.org/w/api.php) until late, but we successfully scraped the name data. You can find it in our data/ directory.
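One way the census name files could be turned into a gender guess is sketched below. It assumes each row has the form `NAME frequency cumulative_frequency rank` and treats the two per-sex frequencies as directly comparable, which is only approximately true. The function names, the `margin` threshold, and the sample rows are hypothetical illustrations, not this project's actual method or data.

```python
def parse_census(text):
    """Map NAME -> frequency (percent) from 'NAME freq cumulative rank' rows."""
    freqs = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            freqs[parts[0].upper()] = float(parts[1])
    return freqs


def guess_gender(name, male, female, margin=0.9):
    """Return (label, confidence); the label is 'unknown' below the margin."""
    m = male.get(name.upper(), 0.0)
    f = female.get(name.upper(), 0.0)
    if m + f == 0:
        return "unknown", 0.0  # name appears in neither list
    p_female = f / (m + f)
    if p_female >= margin:
        return "female", p_female
    if 1 - p_female >= margin:
        return "male", 1 - p_female
    return "unknown", max(p_female, 1 - p_female)


# Made-up rows in the assumed file layout, for illustration only.
male = parse_census("JAMES 3.0 3.0 1\nJORDAN 0.3 3.3 2")
female = parse_census("MARY 2.5 2.5 1\nJORDAN 0.1 2.6 2")

for name in ["Mary", "James", "Jordan", "Zed"]:
    print(name, guess_gender(name, male, female))
```

Names that fall below the margin, or that appear in the gender-neutral lists above, are reported as unknown rather than forced into a category, which keeps the error rate of the confident guesses low.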