Data visualisation - Computed·Blg

Tuesday, December 02. 2014

Big Data Digest: Rise of the think-bots

-----

It turns out that a vital missing ingredient in the long-sought after goal of getting machines to think like humans—artificial intelligence—has been lots and lots of data.

Last week, at the O’Reilly Strata + Hadoop World Conference in New York, Salesforce.com’s head of artificial intelligence, Beau Cronin, asserted that AI has gotten a shot in the arm from the big data movement. “Deep learning on its own, done in academia, doesn’t have the [same] impact as when it is brought into Google, scaled and built into a new product,” Cronin said.

In the week since Cronin’s talk, we saw a whole slew of companies—startups mostly—come out of stealth mode to offer new ways of analyzing big data, using machine learning, natural language recognition and other AI techniques that those researchers have been developing for decades.

One such startup, Cognitive Scale, applies IBM Watson-like learning capabilities to draw insights from vast amounts of what it calls “dark data,” buried either in the Web (Yelp reviews, online photos, discussion forums) or on the company network, such as employee and payroll files, noted KM World.

Cognitive Scale offers a set of APIs (application programming interfaces) that businesses can use to tap into cognitive-based capabilities designed to improve search and analysis jobs running on cloud services such as IBM’s Bluemix, detailed the Programmable Web.

Cognitive Scale was founded by Matt Sanchez, who headed up IBM’s Watson Labs, helping bring to market some of the first e-commerce applications based on the Jeopardy-winning Watson technology, pointed out CRN.

Sanchez, now chief technology officer for Cognitive Scale, is not the only Watson alumnus who has gone on to commercialize cognitive technologies.

Alert reader Gabrielle Sanchez pointed out that another Watson alumnus, engineer Pete Bouchard, recently joined another cognitive computing startup, Zintera, as chief innovation officer. Sanchez, who studied cognitive computing in college, found a demonstration of the company’s “deep learning” cognitive computing platform to be “pretty impressive.”

AI-based deep learning with big data was certainly on the mind of senior Google executives. This week the company snapped up two Oxford University technology spin-off companies that focus on deep learning, Dark Blue Labs and Vision Factory.

The teams will work on image recognition and natural language understanding, Sharon Gaudin reported in Computerworld.

Sumo Logic has found a way to apply machine learning to large amounts of machine data. An update to its analysis platform now allows the software to pinpoint causal relationships within sets of data, Inside Big Data concluded.

A company could, for instance, use the Sumo Logic cloud service to analyze log data to troubleshoot a faulty application.

While companies such as Splunk have long offered search engines for machine data, Sumo Logic moves that technology a step forward, the company claimed.

“The trouble with search is that you need to know what you are searching for. If you don’t know everything about your data, you can’t, by definition, search for it. Machine learning became a fundamental part of how we uncover interesting patterns and anomalies in data,” explained Sumo Logic chief marketing officer Sanjay Sarathy, in an interview.

For instance, the company, which processes about 5 petabytes of customer data each day, can recognize similar queries across different users, and suggest possible queries and dashboards that others with similar setups have found useful.

“Crowd-sourcing intelligence around different infrastructure items is something you can only do as a native cloud service,” Sarathy said.

With Sumo Logic, an e-commerce company could ensure that each transaction conducted on its site takes no longer than three seconds to occur. If the response time is lengthier, then an administrator can pinpoint where the holdup is occurring in the transactional flow.
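As a rough illustration of the three-second check described above, here is a minimal Python sketch. The log record layout, the step names and the `find_slow_transactions` helper are assumptions invented for this example; they are not Sumo Logic's API or query language.

```python
from collections import defaultdict

THRESHOLD_SECONDS = 3.0  # limit taken from the example in the article


def find_slow_transactions(records, threshold=THRESHOLD_SECONDS):
    """Return {transaction_id: (total_seconds, slowest_step)} for slow transactions.

    `records` is an iterable of (transaction_id, step_name, seconds) tuples,
    a hypothetical log format made up for this sketch.
    """
    steps = defaultdict(list)
    for tx_id, step, seconds in records:
        steps[tx_id].append((step, seconds))

    slow = {}
    for tx_id, tx_steps in steps.items():
        total = sum(seconds for _, seconds in tx_steps)
        if total > threshold:
            # The step with the largest duration is where the holdup occurs.
            slowest_step = max(tx_steps, key=lambda s: s[1])[0]
            slow[tx_id] = (total, slowest_step)
    return slow


# Hypothetical log lines: two transactions, three steps each.
sample_log = [
    ("tx1", "cart", 0.4), ("tx1", "payment", 0.9), ("tx1", "confirm", 0.3),
    ("tx2", "cart", 0.5), ("tx2", "payment", 3.1), ("tx2", "confirm", 0.2),
]
slow_transactions = find_slow_transactions(sample_log)
```

In this made-up sample only `tx2` exceeds the limit, and its `payment` step is flagged as the place to look first.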

One existing Sumo Logic customer, fashion retailer Tobi, plans to use the new capabilities to better understand how its customers interact with its website.

One-upping IBM on the name game is DataRPM, which crowned its own big data-crunching natural language query engine Sherlock (named after Sherlock Holmes who, after all, employed Watson to execute his menial tasks).

Sherlock is unique in that it can automatically create models of large data sets. Having a model of a data set can help users pull together information more quickly, because the model describes what the data is about, explained DataRPM CEO Sundeep Sanghavi.

DataRPM can analyze a staggeringly wide array of structured, semi-structured and unstructured data sources. “We’ll connect to anything and everything,” Sanghavi said.

The service can then look for ways that different data sets could be combined to provide more insight.

“We believe that data warehousing is where data goes to die. Big data is not just about size, but also about how many different sources of data you are processing, and how fast you can process that data,” Sanghavi said, in an interview.

For instance, Sherlock can pull together different sources of data and respond with a visualization to a query such as “What was our revenue for last year, based on geography?” The system can even suggest other possible queries as well.

Sherlock has a few advantages over Watson, Sanghavi claimed. The training period is not as long, and the software can be run on-premise, rather than as a cloud service from IBM, for those shops that want to keep their computations in-house. “We’re far more affordable than Watson,” Sanghavi said.

Initially, DataRPM is marketing to the finance, telecommunications, manufacturing, transportation and retail sectors.

One company that certainly does not think data warehousing is going to die is a recently unstealth’ed startup run by Bob Muglia, called Snowflake Computing.

Publicly launched this week, Snowflake aims “to do for the data warehouse what Salesforce did for CRM—transforming the product from a piece of infrastructure that has to be maintained by IT into a service operated entirely by the provider,” wrote Jon Gold at Network World.

Founded in 2012, the company brought in Muglia earlier this year to run the business. Muglia was the head of Microsoft’s server and tools division, and later, head of the software unit at Juniper Networks.

While Snowflake could offer its software as a product, it chooses to do so as a service, noted Timothy Prickett Morgan at Enterprise Tech.

“Sometime either this year or next year, we will see more data being created in the cloud than in an on-premises environment,” Muglia told Morgan. “Because the data is being created in the cloud, analysis of that data in the cloud is very appropriate.”

Posted by Christian Babski in Data visualisation, Software at 11:47

Defined tags for this entry: big data, data mining, data visualisation, software

3829 hits

Friday, October 10. 2014

More Than Meets the Eye: NASA Scientists Listen to Data

Via NASA

-----

Image Credit:

Robert Alexander, NASA/University of Michigan

Robert Alexander spends parts of his day listening to a soft white noise, similar to water falling on the outside of a house during a rainstorm. Every once in a while, he hears an anomalous sound and marks the corresponding time in the audio file. Alexander is listening to the sun’s magnetic field and marking potential areas of interest. After only ten minutes, he has listened to one month’s worth of data.

Alexander is a PhD candidate in design science at the University of Michigan. He is a sonification specialist who trains heliophysicists at NASA’s Goddard Space Flight Center in Greenbelt, Maryland, to pick out subtle differences by listening to satellite data instead of looking at it.

Sonification is the process of displaying any type of data or measurement as sound, such as the beep from a heart rate monitor measuring a person’s pulse, a doorbell ringing every time a person enters a room, or, in this case, explosions indicating large events occurring on the sun. In certain cases, scientists can use their ears instead of their eyes to process data more rapidly -- and to detect more details -- than through visual analysis. A paper on the effectiveness of sonification in analyzing data from NASA satellites was published in the July issue of Journal of Geophysical Research: Space Physics.

“NASA produces a vast amount of data from its satellites. Exploring such large quantities of data can be difficult,” said Alexander. “Sonification offers a promising supplement to standard visual analysis techniques.”

LISTENING TO SPACE

Alexander's focus is on improving and quantifying the success of these techniques. The team created audio clips from the data and shared them with researchers. While the original data from the Wind satellite was not in audio file format, the satellite records electromagnetic fluctuations that can be converted directly to audio samples. Alexander and his team used custom-written computer algorithms to convert those electromagnetic frequencies into sound. Listen to the following multimedia clips to hear the sounds of space.

This clip has three distinct sections: a warble noise leading up to a short knock at slightly higher frequency, followed by a quieter segment containing broadband noise that is both rising and hissing. This clip, gathered from NASA's Wind satellite on Nov. 20, 2007, contains a reverse shock. This type of event occurs when a fast stream of plasma – that is, the super hot, charged gas that fills space – is followed by a slower one, resulting in a shock wave that travels towards the sun.

This audio clip is the previous clip played backwards. Here, trained listeners will notice that the reverse shock event played backwards sounds similar to a forward shock event.

This clip contains audified data from the joint European Space Agency (ESA) and NASA Ulysses satellite gathered on October 26, 1995. The participant in Alexander's study was able to detect artificial noise produced from the instrument, which he did not notice in previous visual analysis. Here, the artificial noise can be heard as a drifting tone.

PROCESSING AN OVERWHELMING AMOUNT OF DATA

Alexander's focus is on using clips like these to quantify and improve sonification techniques in order to speed up access to the incredible amounts of data provided by space satellites. For example, he works with space scientist Robert Wicks at NASA Goddard to analyze the high-resolution observations of the sun. Wicks studies the constant stream of particles from our closest star, known as the solar wind – a wind that can cause space weather effects that interfere with human technology near Earth. The team uses data from NASA's Wind satellite. Launched in 1994, Wind orbits a point in between Earth and the sun, constantly observing the temperature, density, speed and the magnetic field of the solar wind as it rushes past.

Wicks analyzes changes in Wind's magnetic field data. Such data not only carry information about the solar wind; understanding such changes better might also help give a forewarning of problematic space weather that can affect satellites near Earth. The Wind satellite also provides an abundance of magnetometer data points, as the satellite measures the magnetic field 11 times per second. Such incredible amounts of information are beneficial -- but only if all the data can be analyzed.

“There is a very long, accurate time series of data, which gives a fantastic view of solar wind changes and what’s going on at small scales,” said Wicks. “There's a rich diversity of physical processes going on, but it is more data than I can easily look through.”

The traditional method of processing the data involves making an educated guess about where a certain event in the solar wind -- such as subtle wave movements made by hot plasma -- might show up and then visually searching for it, which can be very time-consuming. Instead, Alexander listens to sped-up versions of the Wind data and compiles a list of noteworthy regions that scientists like Wicks can return to and further analyze, expediting the process.

In one example, Alexander’s team analyzed data points from the Wind satellite from November 2007, condensing three hours of real-time recording to a three-second audio clip. To an untrained ear, the data sounds like a microphone recording on a windy day. When Alexander presented these sounds to a researcher, however, the researcher could identify a distinct chirping at the beginning of the audio clip followed by a percussive event, culminating in a loud boom.
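The time compression quoted in the article can be worked out with a little arithmetic. The sketch below is only an illustration: it assumes that every magnetometer reading (11 per second, as stated above) becomes exactly one audio sample and that a month has 30 days, neither of which the article confirms. The function names are made up for the example.

```python
SAMPLES_PER_SECOND = 11  # Wind magnetometer rate quoted in the article


def speedup(real_seconds, clip_seconds):
    """How many times faster than real time the clip plays."""
    return real_seconds / clip_seconds


def playback_rate(real_seconds, clip_seconds, sample_rate=SAMPLES_PER_SECOND):
    """Audio samples per second needed to fit the recording into the clip,
    assuming one audio sample per magnetometer reading."""
    return real_seconds * sample_rate / clip_seconds


three_hours = 3 * 60 * 60                      # 10,800 seconds of recording
factor = speedup(three_hours, 3)               # three hours in three seconds
rate = playback_rate(three_hours, 3)           # samples per second in the clip

one_month = 30 * 24 * 60 * 60                  # assumed 30-day month
month_factor = speedup(one_month, 10 * 60)     # one month in ten minutes
```

Under these assumptions the three-second clip plays 3,600 times faster than real time, at 39,600 samples per second, and listening to a month of data in ten minutes corresponds to a 4,320-fold speed-up.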

By listening only to the auditory representation of the data, the study’s participant was able to correctly predict what this would look like on a more traditional graph. He correctly deduced that the chirp would show up as a particular kind of peak on a spectrogram, a graph that shows the different levels of frequencies present in the waves that Wind recorded. The researcher also correctly predicted that the corresponding spectrogram representation of the percussive event would display a steep slope.

CONVERTING DATA INTO SOUND

Alexander translates the data into audio files through a process known as audification, a specific type of sonification that involves directly listening to raw, unedited satellite data. Translating this data into audio can be likened to part of the process of collecting sound from a person singing into a microphone at a recording studio with reel-to-reel tape. When a person sings into a microphone, the microphone detects changes in pressure and converts the pressure signals to changes in magnetic intensity in the form of an electrical signal. The electrical signals are stored on the reel tape. Magnetometers on the Wind satellite measure changes in the magnetic field directly, creating a similar kind of electrical signal. Alexander writes a computer program to translate this data to an audio file.

“The tones come out of the data naturally. If there is a frequency embedded in the data, then that frequency becomes audible as a sound,” said Alexander.
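To make the idea of audification concrete, here is a minimal Python sketch that writes a numeric series straight into a WAV file using only the standard library. It is not Alexander's program, which the article does not describe in detail. The `audify` helper, the centering and scaling steps and the synthetic input series are all assumptions chosen for illustration.

```python
import io
import math
import struct
import wave


def audify(samples, playback_rate=44100):
    """Scale a numeric series to 16-bit PCM and return mono WAV bytes.

    Each data point becomes exactly one audio sample, so any periodicity
    in the data turns into an audible tone when played back fast enough.
    """
    mean = sum(samples) / len(samples)
    centered = [s - mean for s in samples]           # remove the constant offset
    peak = max(abs(c) for c in centered) or 1.0      # avoid dividing by zero
    frames = b"".join(
        struct.pack("<h", int(round(32767 * c / peak))) for c in centered
    )
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)                # mono
        wav.setsampwidth(2)                # 16-bit samples
        wav.setframerate(playback_rate)    # samples played per second
        wav.writeframes(frames)
    return buffer.getvalue()


# Synthetic stand-in for magnetometer readings (not real Wind data):
# a 0.5 Hz oscillation sampled 11 times per second for one minute.
series = [50.0 + 5.0 * math.sin(2 * math.pi * 0.5 * n / 11) for n in range(11 * 60)]
wav_bytes = audify(series, playback_rate=39600)
```

Playing 11-samples-per-second data at 39,600 samples per second speeds it up 3,600 times, so the 0.5 Hz oscillation in this synthetic series would be heard as an 1,800 Hz tone. Writing `wav_bytes` to a file with a `.wav` extension makes it playable in an ordinary audio player.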

Listening to data is not new. In a study in 1982, researchers used audification to identify micrometeoroids, or small ring particles, hitting the Voyager 2 spacecraft as it traversed Saturn's rings. The impacts were visually obscured in the data but could be easily heard – sounding like intense impulses, almost like a hailstorm.

However, the method is not often used in the science community because it requires a certain level of familiarity with the sounds. For instance, the listener needs to have an understanding of what typical solar wind turbulence sounds like in order to identify atypical events. “It’s about using your ear to pick out subtle differences,” Alexander said.

Alexander initially spent several months with Wicks teaching him how to listen to magnetometer data and highlighting certain elements. But the hard work is paying off as analysis gets faster and easier, leading to new assessments of the data.

“I’ve never listened to the data before,” said Wicks. “It has definitely opened up a different perspective.”

Kasha Patel
NASA’s Goddard Space Flight Center, Greenbelt, Md.

Posted by Christian Babski in Data visualisation, Software at 10:11

3046 hits

Tuesday, July 08. 2014

LaMetric Is A Smart And Hackable Ticker Display

Via TechCrunch

-----

Ever since covering Fliike, a beautifully-designed physical ‘Like’ counter for local businesses, I’ve been thinking about how the idea could be extended, with a fully-programmable, but simple, ticker-style Internet-connected display.

A few products along those lines do already exist, but I’ve yet to find anything that quite matches what I had in mind. That is, until recently, when I was introduced to LaMetric, a smart ticker being developed by UK/Ukraine Internet of Things (IoT) startup Smart Atoms.

Launching its Kickstarter crowdfunding campaign today, the LaMetric is aimed at both consumers and businesses. The idea is you may want to display alerts, notifications and other information from your online “life” via an elegant desktop or wall-mountable and glance-able display. Likewise, businesses that want an Internet-connected ticker, displaying various business information, either publicly for customers or in an office, are also a target market.


The device itself has a retro, 8-bit style desktop clock feel to it, thanks to its ‘blocky’ LED light powered display, which is part of its charm. The display can output one icon and seven numbers, and is scrollable.

But, best of all, the LaMetric is fully programmable (or “hackable”) via the accompanying app and comes with a bunch of off-the-shelf widgets, along with support for RSS and services like IFTTT, Smart Things, Wig Wag and Ninja Blocks, so you can get it talking to other smart devices or web services. Seriously, this thing goes way beyond what I had in mind (try the simulator for yourself) and, for an IoT junkie like me, is just damn cool.

Examples of the kind of things you can track with the device include time, weather, subject and time left till your next meeting, number of new emails and their subject lines, CrossFit timings and fitness goals, number of to-dos for today, stock quotes, and social network notifications.

Or for businesses, this might include Facebook Likes, website visitors, conversions and other metrics, app store rankings, downloads, and revenue.


In addition to the display, the device has back and forward buttons so you can rotate widgets (though these can be set to automatically rotate), as well as an enter key for programmed responses, such as accepting a calendar invitation.

There’s also a loudspeaker for audio alerts. The LaMetric is powered by micro-USB and also comes as an optional and more expensive battery-powered version.

Early-bird backers on Kickstarter can pick up the LaMetric for as little as $89 (plus shipping) for the battery-less version, with countless other options and perks, increasing in price.

Posted by Christian Babski in Data visualisation, Hardware, Network at 10:26

Defined tags for this entry: data visualisation, hardware, internet of things, network

2854 hits

Wednesday, April 16. 2014

Plugin Visualizes Your Entire Browser History

Via fastcodesign

-----

Want to know exactly how much of your Internet time is spent fiddling around on Facebook versus doing all-important Googling, online shopping or watching videos? Try Iconic History, a plugin that puts your whole browser history into an exhaustive stream of favicons, the icons that appear next to a website's name on a browser tab.

Created by Carnegie Mellon University computer science student Shan Huang for a class on interactive art and computational design, the plugin pulls your browser history from Google Chrome (which keeps up to four months of data) and visualizes every site you've visited by sorting each individual URL's associated icon chronologically. So it'll show if you've spent an entire day surfing Facebook, or if you've stayed up late into the night visiting Wikipedia.

"Because I spend so much time online every day, doing all sorts of things from working to socializing to just aimless wandering, I thought browser history alone could narrate a significant portion of my life and what was on my mind," Huang writes in her summary of the project. For example, she found she frequently stayed up late online shopping at sites like Urban Outfitters and Macy's.

Each icon is linked to the original URL, so it's easy to go back and see exactly which Urban Outfitters sweater you were coveting or which seven YouTube videos you watched in a row, though these are visualized just as a long list of identical pictures. You can filter the list to show a specific site or time window during the day, in case you really need to know what you've been searching between midnight and 6 a.m. Hopefully you haven't been clearing your most interesting data!
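The chronological sorting and filtering described above can be sketched in a few lines of Python. This is not Huang's plugin code: the `favicon_stream` helper and the `(datetime, url)` record format are simplified stand-ins invented for the example, and the sample history is made up.

```python
from datetime import datetime
from urllib.parse import urlsplit


def favicon_stream(visits, start_hour=0, end_hour=24, site=None):
    """Return (time, host, url) tuples in chronological order.

    `visits` is a list of (datetime, url) pairs, a simplified stand-in for
    browser history records. Only visits whose hour falls in the range
    [start_hour, end_hour) are kept; `site` optionally restricts the host.
    """
    selected = []
    for when, url in visits:
        host = urlsplit(url).hostname or ""
        if not (start_hour <= when.hour < end_hour):
            continue
        if site is not None and host != site:
            continue
        selected.append((when, host, url))
    return sorted(selected)


# Made-up history entries for illustration.
history = [
    (datetime(2014, 4, 15, 23, 50), "https://www.facebook.com/"),
    (datetime(2014, 4, 16, 1, 30), "https://en.wikipedia.org/wiki/Favicon"),
    (datetime(2014, 4, 16, 0, 15), "https://www.youtube.com/watch?v=abc"),
    (datetime(2014, 4, 16, 14, 5), "https://en.wikipedia.org/wiki/Sonification"),
]
late_night = favicon_stream(history, start_hour=0, end_hour=6)
wikipedia_only = favicon_stream(history, site="en.wikipedia.org")
```

Each host in the result would then be mapped to its favicon for display; that step depends on the browser and is left out here.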


Try Iconic History for yourself here, or check out Huang's demo.

Posted by Christian Babski in Data visualisation at 10:33

2974 hits

Monday, March 17. 2014

Google Project Tango: 200 phones with 3D sensors for room-scanning

Via Slash Gear

-----

This week the experimental developer-aimed group known as Google ATAP (aka Advanced Technology and Projects, the company’s skunkworks) announced Project Tango. The group has suggested Project Tango will appear first as a phone with 3D sensors. These 3D sensors will be able to scan and build a map of the room they’re in, opening up a whole world of possibilities.

The device that Project Tango will release first will be just about as limited-edition as they come. Issued in an edition of 200, this device will be sent to developers only. This developer group will be hand-picked by Google’s ATAP - and sign-ups start today. (We’ll be publishing the sign-up link once active.)


Speaking on this skunkworks project this morning was Google’s Johnny Lee. Lee is ATAP’s technical program lead, and he’ll be heading the project as far as the public will see it. This is the same group that brought you Motorola’s digital tattoos, if you’ll remember.

Posted by Christian Babski in Data visualisation, Hardware, Mobile, Software, Technology at 17:28

3082 hits

Monday, October 07. 2013

Man and machine unite to explore the impossible depths of projection art

Via The Verge

-----


Box is the work of San Francisco studio Bot & Dolly, which believes its new technology can "tear down the fourth wall" in the theater. "Through large-scale robotics, projection mapping and software engineering, audiences will witness the trompe l'oeil effect pushed to new boundaries," says creative director Tarik Abdel-Gawad. "We believe this methodology has tremendous potential to radically transform visual art forms and define new genres of expression." The piece is an effective demonstration of the studio's projection mapping system, but it works in its own right as an enthralling piece of art.

Posted by Christian Babski in Data visualisation, Innovation&Society at 09:41

Defined tags for this entry: art, data visualisation, innovation&society

17944 hits

Saturday, June 29. 2013

Jumbled Up, No Detection Font Prevents Infringement of Privacy

Via TAXI

-----

Fed up with the NSA’s infringement of privacy, an internet user by the name of Sang Mun has developed a font which cannot be read by computers.

The font is called ‘ZXX’, after the code the Library of Congress uses to state that a document has “no linguistic content”. It is garbled in such a way that computers with Optical Character Recognition (OCR) will not be able to recognize it.

Available in four “disguises”, this font uses camouflage techniques to trick the computers of governments and corporations into thinking that no useful information can be collated from people, while remaining readable to the human eye.

The font developer urges users to fight against this infringement of privacy, and has made this font free for all users on his website.

Posted by Christian Babski in Data visualisation, Software at 11:33

Defined tags for this entry: data visualisation, network, privacy, software, surveillance

3167 hits

Thursday, May 16. 2013

Wikipedia Recent Changes Map

Via hatnote

-----

Wikipedia is constantly growing, and it is written by people around the world. To illustrate this, we created a map of recent changes on Wikipedia, which displays the approximate location of unregistered users and the article that they edit.

Unregistered Wikipedia users

When an unregistered user makes a contribution to Wikipedia, he or she is identified by his or her IP address. These IP addresses are translated to the contributor’s approximate geographic location. A study by Fabian Kaelin in 2011 noted that unregistered users make approximately 20% of the edits on English Wikipedia [edit: likely closer to 15%, according to more recent statistics], so Wikipedia’s stream of recent changes includes many other edits that are not shown on this map.
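Because unregistered contributors are identified by IP address, a simple way to separate them from registered users in a stream of edits is to test whether the editor name parses as an IP address. The sketch below uses Python's standard `ipaddress` module. The `is_unregistered` and `unregistered_share` helpers and the sample names are assumptions for illustration, not the map's actual code, and the geolocation lookup is left out because it depends on an external service.

```python
import ipaddress


def is_unregistered(editor_name):
    """True if the editor name parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(editor_name)
    except ValueError:
        return False
    return True


def unregistered_share(editor_names):
    """Fraction of edits in the list made under an IP address."""
    names = list(editor_names)
    if not names:
        return 0.0
    return sum(is_unregistered(name) for name in names) / len(names)


# Made-up editor names; the addresses come from documentation-only ranges.
editors = ["192.0.2.17", "ExampleUser", "2001:db8::1", "AnotherUser", "Bot42"]
share = unregistered_share(editors)
```

In this made-up sample two of the five edits come from IP addresses, so only those two would be candidates for plotting on the map.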

You may see some users add non-productive or disruptive content to Wikipedia. A survey in 2007 indicated that unregistered users are less likely to make productive edits to the encyclopedia. Do not fear: improper edits can be removed or corrected by other users, including you!

How it works

This map listens to live feeds of Wikipedia revisions, broadcast using wikimon. We built the map using a few nice libraries and services, including d3, DataMaps, and freegeoip.net. This project was inspired by WikipediaVision’s (almost) real-time edit visualization.

The Wikipedia Recent Changes Map is open source and available on github.

If you are interested in visualizing Wikipedia, check out the other data resources that are available.

Posted by Christian Babski in Data visualisation, Innovation&Society, Software at 11:39

Defined tags for this entry: data visualisation, innovation&society, software, wiki

2971 hits

Thursday, May 02. 2013

Driving Miss dAIsy: What Google’s self-driving cars see on the road

Via Slash Gear

-----

We’ve been hearing a lot about Google’s self-driving car lately, and we’re all probably wanting to know how exactly the search giant is able to build a car that drives itself without hitting anything or anyone. A new photo has surfaced that demonstrates what Google’s self-driving vehicles see while they’re out on the town, and it looks rather frightening.


The image was tweeted by Idealab founder Bill Gross, along with a claim that the self-driving car collects almost 1GB of data every second (yes, every second). This data includes imagery of the car’s surroundings in order to effectively and safely navigate roads. The image shows that the car sees its surroundings through an infrared-like camera sensor, and it can even pick out people walking on the sidewalk.

Of course, 1GB of data every second isn’t too surprising when you consider that the car has to get a 360-degree image of its surroundings at all times. The image we see above even distinguishes different objects by color and shape. For instance, pedestrians are in bright green, cars are shaped like boxes, and the road is in dark blue.
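To put the claimed rate in perspective, the short Python sketch below scales it up to a minute and an hour of driving. It treats the "almost 1GB" figure from the tweet as an upper bound of exactly 1 GB per second and uses decimal units (1 TB = 1,000 GB); both are assumptions made for this illustration.

```python
GB_PER_SECOND = 1  # upper bound; the tweet said "almost 1GB" every second


def data_volume_gb(seconds, rate=GB_PER_SECOND):
    """Gigabytes collected over the given number of seconds."""
    return seconds * rate


per_minute_gb = data_volume_gb(60)       # one minute of driving
per_hour_gb = data_volume_gb(60 * 60)    # one hour of driving
per_hour_tb = per_hour_gb / 1000         # decimal terabytes
```

At that rate, an hour on the road would amount to up to 3,600 GB, or 3.6 TB, of sensor data.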

However, we’re not sure where this photo came from, so it could simply be a rendering of someone’s idea of what Google’s self-driving car sees. Either way, Google says that we could see self-driving cars make their way to public roads in the next five years or so, which actually isn’t that far off. Tesla Motors CEO Elon Musk is also interested in developing self-driving cars. However, they certainly don’t come without their problems, and we’re guessing that the first batch of self-driving cars probably won’t be in 100% tip-top shape.

Posted by Christian Babski in Data visualisation, Hardware, Programming, Software, Technology at 09:33

Defined tags for this entry: artificial intelligence, car, data visualisation, google, hardware, programming, sensors, software, technology

4039 hits

Thursday, March 28. 2013

Internet Census 2012 - Port scanning /0 using insecure embedded devices

Via Internet Census 2012

-----

Abstract: While playing around with the Nmap Scripting Engine (NSE) we discovered an amazing number of open embedded devices on the Internet. Many of them are based on Linux and allow login to standard BusyBox with empty or default credentials. We used these devices to build a distributed port scanner to scan all IPv4 addresses. These scans include service probes for the most common ports, ICMP ping, reverse DNS and SYN scans. We analyzed some of the data to get an estimation of the IP address usage.
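The "/0" in the title denotes the entire IPv4 address space. The Python sketch below uses the standard `ipaddress` module to show how large that space is and how it can be divided into equal blocks. It performs no scanning, and the split into /8 blocks is only an illustrative assumption, not the partitioning scheme used in the study.

```python
import ipaddress

# "0.0.0.0/0" covers every possible IPv4 address.
whole_internet = ipaddress.ip_network("0.0.0.0/0")
total_addresses = whole_internet.num_addresses  # 2**32


def split_evenly(network, new_prefix):
    """Divide a network into equal-sized blocks with the given prefix length."""
    return list(network.subnets(new_prefix=new_prefix))


# Hypothetical work split: one /8 block per worker.
slash8_blocks = split_evenly(whole_internet, 8)
addresses_per_block = slash8_blocks[0].num_addresses  # 2**24
```

The full space holds 4,294,967,296 addresses, and splitting it into /8 blocks gives 256 chunks of 16,777,216 addresses each.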

All data gathered during our research is released into the public domain for further study.

The full study

Continue reading "Internet Census 2012 - Port scanning /0 using insecure embedded devices"

Posted by Christian Babski in Data visualisation, Innovation&Society at 17:02

Defined tags for this entry: data visualisation, innovation&society, internet, security

2765 hits
