Getting Started with Python for Data Scientists
With the R Users DC Meetup broadening its topic base to include other statistical programming tools, it seemed only reasonable to write a meta post highlighting some of the best Python tutorials and...
Python vs R vs SPSS … Can’t All Programmers Just Get Along?
Programmers have long been very proud of and loyal to their tools, and often very vocal. This has led to well-contested rivalries and “fights” about which tool is better: emacs or vi; Java or C++; Perl...
Using Data to Create Viral Content. [INFOGRAPHIC]
Netflix recently used their own data to drive the creation of the hit series ‘House of Cards’. A similar approach can be applied to other forms of media to create content that is highly likely to...
PyData and More Tools for Getting Started with Python for Data Scientists
It turns out that people are very interested in learning more about Python, and our last post, “Getting Started with Python for Data Scientists,” generated a ton of comments and recommendations....
Natural Language Processing and Big Data: Using NLTK and Hadoop – Talk Overview
My previous startup, Unbound Concepts, created a machine learning algorithm that determined the textual complexity (e.g. reading level) of children’s literature. Our approach started as a natural...
Big Data and Natural Language Processing – Part 1
We hope you enjoyed the introduction to this series; Part 1 is below. “The science that has been developed around the facts of language passed through three stages before finding its true and unique...
The “Foo” of Big Data – Part 2
Welcome to Part 2 of this epic Big Data and Natural Language Processing perspective series. Here are the intro and Part 1 if you missed either of them. Domain knowledge is incredibly important,...
Python’s Natural Language Toolkit (NLTK) and Hadoop – Part 3
Welcome back to part 3 of Ben’s talk about Big Data and Natural Language Processing. (Click through to see the intro, part 1, and part 2). We chose NLTK (Natural Language Toolkit) particularly because...
Hadoop for Preprocessing Language – Part 4
We are glad that you have stuck around for this long and, just in case you have missed any parts, click through to the introduction, part 1, part 2, and part 3. You might ask me, doesn’t Hadoop do text...
Beyond Preprocessing – Weakly Inferred Meanings – Part 5
Congrats! This is the final post in our six-part series! Just in case you have missed any parts, click through to the introduction, part 1, part 2, part 3, and part 4. After you have treebanks, then...
PyAutoDiff: automatic differentiation for NumPy
We are excited to have a guest post discussing a new tool that is freely available for the Python community. Welcome, Jeremiah Lowin, the Chief Scientist of the Lowin Data Company, to the growing pool...
Stepping up to Big Data with R and Python: A Mind Map of All the Packages You...
On May 8, we kicked off the transformation of R Users DC to Statistical Programming DC (SPDC) with a meetup at iStrategyLabs in Dupont Circle. The meetup, titled “Stepping up to big data with R and...
Data Visualization: From Excel to ???
So you’re an Excel wizard: you make the best graphs and charts Microsoft’s classic product has to offer, and you expertly integrate them into your business operations. Lately...
Data Community DC Video Series Kicks Off: Dr. Jesse English Talks NLP and...
We are excited to announce the first in a new series of posts and a brand new initiative: Data Community DC Videos! We are going to film and publish online videos (and separate audio, resources...
A Julia Meta Tutorial
If you are thinking about taking Julia, the hot new mathematical, statistical, and data-oriented programming language, for a test drive, you might need a little bit of help. In this blog we round up...
Python for Data Analysis: The Landscape of Tutorials
Python has long been one of the premier general-purpose scripting languages and a major web development language. Numerical and data analysis and scientific programming developed through the packages NumPy and...
Data Science MD July Recap: Python and R Meetup
For July’s meetup, Data Science MD was honored to have Jonathan Street of NIH and Brian Godsey of RedOwl Analytics come discuss using Python and R for data analysis. Jonathan started off by describing...
Hadoop for Data Science: A Data Science MD Recap
On October 9th, Data Science MD welcomed Dr. Donald Miner as its speaker to talk about doing data science work and how the Hadoop framework can help. To start the presentation, Don was very clear...
PyDataNYC 2013 – A Summary of a Fantastic Conference for Data Community
PyData NYC 2013 was a two-day conference this past weekend (Saturday and Sunday, 11/9 and 11/10) with a day of tutorials on Friday. Saturday and Sunday featured keynotes each morning and three tracks...
A Tutorial for Deploying a Django Application that Uses Numpy and Scipy to...
by Sean Patrick Murphy. Introduction: This longer-than-initially-planned article walks one through the process of deploying a non-standard Django application on a virtual instance provisioned not from...