data science

15 Minutes With: Chris Baglieri

Gathering, cleaning, manipulating, and assessing data is a complex (and expensive) job – especially if the data takes a wide variety of forms and comes from many different sources. So why should companies invest in that work?

That’s not a Data Lake, THIS is a Data Lake – IoT on AWS – A Philly Cloud Computing Event

This talk will review two common use cases for captured metric data: 1) real-time analysis, visualization, and quality assurance, and 2) ad-hoc analysis. The most common open-source streaming options will be mentioned; however, this talk will focus on Apache Flink specifically. Apache Beam will also be discussed briefly in the context of a unified data processing model.
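For a taste of the first use case, here is a minimal PyFlink sketch of a keyed aggregation over metric data. This is an illustrative assumption, not code from the talk: the in-memory source, the sensor names, and the job name are all invented for the example, and a production pipeline would read from a real source such as Kafka.

```python
# Hypothetical minimal PyFlink sketch (assumes the apache-flink package
# is installed); the data and names below are illustrative only.
from pyflink.datastream import StreamExecutionEnvironment

env = StreamExecutionEnvironment.get_execution_environment()

# A real pipeline would use a Kafka or socket source; a small
# in-memory collection keeps the sketch self-contained.
metrics = env.from_collection([
    ("sensor-1", 20.5),
    ("sensor-2", 19.8),
    ("sensor-1", 21.1),
])

# Running sum per sensor -- the kind of keyed aggregation a
# real-time quality-assurance job might apply to captured metrics.
(metrics
    .key_by(lambda reading: reading[0])
    .reduce(lambda a, b: (a[0], a[1] + b[1]))
    .print())

env.execute("metric-aggregation-sketch")
```

One appeal of Beam's unified model, touched on in the talk, is that a pipeline written once can run on several runners, Flink among them, rather than being tied to a single engine.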

TechCast #110 – Chris Baglieri

In this episode, host Ken Rimple talks to Chris Baglieri of BlackFynn about the company’s work in health-research data science. He discusses how he and his team aid researchers who are attacking Parkinson’s disease and other disorders, applying the kind of deep genetic research that has benefited cancer research over the past two decades. …

Pink Noise in Neural Nets: A Brief Experiment

Disclaimer: Some basic exposure to machine learning is assumed. Neural nets are on the rise, now that computing power and parallel data-processing capabilities have reached levels that allow them to shine. Recurrent neural nets, the more sophisticated kind that possess time dynamics, have achieved spectacular results in certain areas. Overfitting, however, has …
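For readers curious what "pink" (1/f) noise looks like in code, here is a minimal NumPy sketch that shapes white noise in the frequency domain and jitters an input with it. The `pink_noise` helper and the injection step are illustrative assumptions about how such an experiment might be set up, not the exact method from the post.

```python
# Minimal sketch: generate pink (1/f) noise by spectrally shaping
# white noise. Illustrative only; the post's experiment may differ.
import numpy as np

def pink_noise(n_samples, rng=None):
    """Generate unit-variance 1/f ("pink") noise of length n_samples."""
    rng = np.random.default_rng() if rng is None else rng
    white = rng.standard_normal(n_samples)
    spectrum = np.fft.rfft(white)
    freqs = np.fft.rfftfreq(n_samples)
    freqs[0] = freqs[1]            # avoid division by zero at DC
    spectrum /= np.sqrt(freqs)     # amplitude ~ 1/sqrt(f) => power ~ 1/f
    noise = np.fft.irfft(spectrum, n=n_samples)
    return noise / noise.std()     # normalize to unit variance

# e.g., perturb an input vector with small pink-noise jitter,
# one plausible way to inject noise during training.
x = np.zeros(1024)
x_noisy = x + 0.1 * pink_noise(x.size)
```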

The O’Reilly AI Conference

I recently attended the O’Reilly AI Conference in New York, where artificial intelligence practitioners showcased the impressive strides they’ve made so far in using AI for real-world applications.

Data I/O 2013 – All the Data and Still Not Enough??? – Claudia Perlich

Predictive modeling is one of the figureheads of big data. Machine learning theory asserts that the more data, the better, and empirical observations suggest that the more granular the data, the better the performance (provided you have modern algorithms and big data). But the paradox of predictive modeling is that when you need models the most, even all the data is not enough.