Monthly Archives: May 2012

YouTube – everywhere?

Yesterday, YouTube turned seven.

I’ve recently become aware of just how pervasive YouTube has become. It’s available on a range of “computer” platforms – desktops and laptops, mobile phones and tablets. I’m able to access it via AppleTV, Xbox, and a Sony Blu-ray player. Friends who recently updated their home media setup have it on their internet-enabled TV as well as via the Virgin TiVo box on the same “system”. Alongside BBC iPlayer, it’s actually more pervasive in the UK, across the various devices many of us have in our living rooms, than the broadcast DTV/satellite/cable channels themselves. It’s also noticeably more present across a broad range of devices than alternatives like Vimeo, which are arguably better at presenting beautiful, longer HD content on the web itself.

This is both exciting and potentially problematic.

For those of us who have been seeing a multichannel, multimedia future ahead for some time, it’s a validation of the success of streaming web video in breaking the monopolies of the existing broadcasters and media companies. Over time, Google has added some tremendous value to YouTube – enabling creators to rapidly upload, perform simple edits, add soundtracks, and share content all within a rich HTML browser experience. It is also easy to reach a wide range of devices simply by ticking the “make this video available to mobile” box on the video management page – Google does all the heavy lifting of transcoding, resizing, and deciding on whether Flash or HTML5 is a better delivery mechanism, etc.

However, at the same time, it’s kind of… well, clunky. In order to consume content from YouTube on any of the platforms I mentioned above, you have to visit a dedicated YouTube widget, app or channel and then navigate around content within that box (oh, and each platform has a slightly different way of presenting that content). It’s not integrated with the viewing experience – I can’t just say to my TV or viewing device, “show me videos of kittens”, and have it aggregate results across different sources, YouTube included. Not only that, but we all know just how variable YouTube content can be, in terms of production quality and duration, and in the antisocial nature of the comments and social interactions around videos. For some of the most popular videos I’ve posted on my channel, I can’t tell you how long I spend moderating the most unbelievably asinine comments! Oh, and when we consider the increasing use of streaming video online – be it iPlayer, YouTube, Netflix or any other source – we constantly have to consider the impact on available bandwidth. Bandwidth and connectivity are not universal, no matter how much we may wish they were.

The other side of this is the group of voices who will point to the dominance of Google and its influence over brands and advertising. All very well, but I like to remind people that for all of the amazing “free” services we enjoy (Facebook, Twitter, Google and others), we do have to pay – with an acceptance of advertising, and/or by choosing to share some of our personal data – or go back to paying cable and satellite providers for their services. It’s really a simple transaction.

I guess I don’t really have a message with this blog entry, other than to share my observation of the amazingly rapid rise of the new media titan(s). If I were to offer any further thoughts or advice, it would be the following:

  • explore online video services more – you probably have access to them in more places than you think.
  • remember that video you produce may be viewed on any device, from small mobile handsets to a nice HD television – so always try to produce your content at the highest quality setting possible, and let YouTube or the other video hosts do the rest.
  • richly tag and describe your content to make it easier to find. “Video1.mov” tells me nothing.
  • learn about the parameters which control how your content is displayed. I’ve previously written about this; the content is still useful but I should probably create an update.

This omnipresence across platforms is one of the reasons why I’ve started to use my YouTube channel as the primary, canonical source for all of my video content. I used to post to Viddler and Vimeo and occasionally put a clip on Facebook, but now that I’m able to post longer movies, I’ve also uploaded the full videos of various talks that I could previously only host on Vimeo. I’m not abandoning all other sources, but a focus on one channel makes a certain amount of sense.


Digging through what Twitter knows about me

I joined Twitter on February 21, 2007, at exactly 15:14:48, and I created my account via the web interface. As you can see, my first tweet was pretty mundane!

I remember discussing this exciting, cool “new Web 2.0 site” with Kim Plowright (@mildlydiverting) in Roo’s office in Hursley a couple of days earlier, and before long he, Ian and I were all trying this new newness out. It was just before the 2007 SXSWi, where Twitter really started to get on the radar of the geekerati.

But wait a moment! The API only lets you pull back somewhere around the most recent 3,200 tweets, so how was I able to get all the way back to that tweet from five years ago, when I’ve got over 33,000 of them to my name?
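
To illustrate the point, here’s a rough sketch (against the current v1 REST API) of how you hit that ceiling – user_timeline hands back at most 200 tweets per request, and stops serving pages somewhere around 3,200 tweets deep, however many you’ve actually posted:

#!/bin/bash
# Sketch: page back through a timeline using the v1 user_timeline call.
# 200 tweets per request x 16 pages = roughly 3,200 tweets, which is as far back as the API will go.
USER=andypiper    # substitute your own screen name
for page in {1..16}; do
  curl -s "https://api.twitter.com/1/statuses/user_timeline.json?screen_name=${USER}&count=200&page=${page}" \
    > "tweets-page-${page}.json"
done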

It’s a relatively little-known fact that you can ask Twitter to disclose everything they hold associated with your account – and they will (at least, in certain jurisdictions – I’m not sure whether they will do this for every single user, but in the EU they are legally bound to do so). I learned about this recently after reading Anne Helmond’s blog entry on the subject, and decided to follow the process through. I first contacted Twitter on April 24, and a few days later faxed (!) them my identity documentation, most of which was “redacted” by me 🙂. Yesterday, May 11, a very large zip file arrived via email.

I say very large, but actually it was smaller than the information dump that Anne received. Her tweets were delivered as 50MB of files, but mine came in nearer to 9MB zipped – 17MB unzipped. I’d expected a gigantic amount of data relating to my tweets, but it seems as though they have recently revised their process, and now only provide the basic metadata about each one rather than a whole JSON dump.

So, what do you get for your trouble? Here’s the list of contents, as outlined by Twitter’s legal department in their email to me.

– USERNAME-account.txt: Basic information about your Twitter account.
– USERNAME-email-address-history.txt: Any records of changes of the email address on file for your Twitter account.
– USERNAME-tweets.txt: Tweets of your Twitter account.
– USERNAME-favorites.txt: Favorites of your Twitter account.
– USERNAME-dms.txt: Direct messages of your Twitter account.
– USERNAME-contacts.txt: Any contacts imported by your Twitter account.
– USERNAME-following.txt: Accounts followed by your Twitter account.
– USERNAME-followers.txt: Accounts that follow your Twitter account.
– USERNAME-lists_created.txt: Any lists created by your Twitter account.
– USERNAME-lists_subscribed.txt: Any lists subscribed to by your Twitter account.
– USERNAME-lists-member.txt: Any public lists that include your Twitter account.
– USERNAME-saved-searches.txt: Any searches saved by your Twitter account.
– USERNAME-ip.txt: Logins to your Twitter account and associated IP addresses.
– USERNAME-devices.txt: Any records of a mobile device that you registered to your Twitter account.
– USERNAME-facebook-connected.txt: Any records of a Facebook account connected to your Twitter account.
– USERNAME-screen-name-changes.txt: Any records of changes to your Twitter username.
– USERNAME-media.zip: Images uploaded using Twitter’s photo hosting service (attached only if your account has such images).
– other-sources.txt: Links and authenticated API calls that provide information about your Twitter account in real time.

Of these, let’s dig a bit more deeply into just a few of the items – there’s no need to pick everything to pieces.

The “tracking data” is contained in andypiper-devices.txt and andypiper-ipaudit.txt – interesting. The devices file essentially contains information on my phone, presumably for the SMS feature: they know my number and my carrier. The IP address list tracks back to the start of March, so they have two months of data on which IPs have been used to access my account. I’ve yet to subject that to much scrutiny to check where those addresses are located – that’s another script I need to write.
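
It would probably be something along these lines – a minimal sketch, assuming the ipaudit file has an IPv4 address somewhere on each line (I haven’t studied the exact layout yet), and using geoiplookup from the free GeoIP tools:

#!/bin/bash
# Sketch: a very rough geolocation pass over the login IPs in the Twitter dump.
# Assumes andypiper-ipaudit.txt contains one or more IPv4 addresses per line;
# geoiplookup comes from the GeoIP package (Homebrew or apt will provide it).
grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' andypiper-ipaudit.txt \
  | sort -u \
  | while read ip; do
      printf '%s: ' "$ip"
      geoiplookup "$ip"
    done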

I took a look at andypiper-contacts.txt and was astonished to find out how much of my contact data Twitter’s friend finder and mobile apps had slurped up. I mean, I don’t even have all of this in my address book… given that the file contained the sender addresses of various online retailer newsletters, I’m guessing that Google’s API (I’m a Gmail user) probably coughed up not just my defined contact list, but also the email address of anyone I’d ever heard from, ever.

Fortunately, there’s a way to remove this information permanently, which Anne has written about. I went ahead and did that, and then Twitter warned me that the Who To Follow suggestions might not be so relevant. That’s OK because I don’t use that feature anyway – and in practice, I’ve noticed no difference in the past 24 hours!

I use DMs a lot for quick communication, particularly with colleagues (it was a pretty reliable way of contacting @andysc when I needed him at IBM!). That’s reflected in the size of andypiper-dms.txt, which is also a scary reminder to myself: I used to delete DMs, but since Twitter now makes it harder to get to and remove them, I’ve stopped doing so – and there’s a lot of private data in there I wish I’d scrubbed.

Taking a peek at the early tweets in andypiper-tweets.txt, I’m trying to remember when the @reply syntax was formalised, and when Twitter themselves started creating links to the other person’s profile. Many of my early tweets refer to @roo and @epred, and I don’t think they ever went by those handles. Five years is a long time.

I mentioned that the format used to deliver the data appears to have changed since Anne made her request. She got a file containing a JSON dump of each tweet, including metadata like retweet information, in_reply_to, geo, and so on. By comparison, I now get simply the creation info, the status ID (the magic that lets you get back to the tweet via the web UI), and the text itself:

********************
user_id: 786491
created_at: Wed Feb 21 15:43:54 +0000 2007
created_via: web
status_id: 5623961
text: overheating in an office with no comfort cooling or aircon. About to drink water.

It’s a real shame that they have taken this approach, as it means the data is now far more cumbersome to parse and work with. However, using some shell scripts I did some simple slicing-and-dicing, because I was curious how my use of Twitter had grown over time. Here’s a chart showing the number of tweets I posted per year (2012 is a “to date” figure, of course). It looks like growth was slow initially, but last year I suddenly nearly doubled my output.
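
The slicing-and-dicing doesn’t need to be anything more sophisticated than this sort of thing – a minimal sketch, assuming every record in the dump carries a created_at: line in the format shown above, with the year as the final field:

#!/bin/bash
# Sketch: count tweets per year from the flat-text dump.
grep '^created_at:' andypiper-tweets.txt \
  | awk '{ print $NF }' \
  | sort \
  | uniq -c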

Still considering what other analysis I’d like to do. I can chart out the client applications I’ve used, or make a word cloud showing how my conversational topics have changed over time… now that all of the information is mine, that is. It is just a shame I have to do so much manual munging of the output beforehand.
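
The client breakdown, at least, ought to be another one-liner in the same vein (again assuming the created_via: field is present and consistent across records):

# Sketch: tally the applications used to post, most frequent first.
grep '^created_via:' andypiper-tweets.txt | sort | uniq -c | sort -rn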

Oh, and the email I received from Twitter Legal also said:

No records were found of any disclosure to law enforcement of information about your Twitter account.

So, that’s alright then…

Why did I do this? Firstly, because I believe in the Open Web and in ownership of my own data. Secondly, because I hope that I’ll now be able to archive this personal history and make it searchable via a tool like ThinkUp (which I’ve been running for a while now, but not for the whole five years). Lastly… no, not “because I could”… well, OK, at least partly because I could… but mainly because I believe that companies like Twitter, Facebook, Google and others should be fully transparent with their users about the data they hold, and I hope that by going through this currently-slightly-painful procedure, I’ll encourage Twitter to put formal tools in place that provide this level of access to everyone in a frictionless manner.

If you’ll excuse me, I’m off to dig around some more…

Geekery in 8-bits and more

In which I get misty-eyed and nostalgic, geek out over electronics, and think about mobile and the cloud.

Then

On Saturday I went along to Horizons, the 30th anniversary of the ZX Spectrum event organised by Paul Squires and Leila Johnston and held at the BFI in London. The event ran over both days of the weekend, but I wasn’t able to stay on the Sunday, so I missed at least half of the fun!

Steven Goodwin reads Sinclair User

Although I’m full of nostalgia for the 8-bit era, I have to confess I never actually owned a Speccy or any Sinclair hardware. My friends did, but I was primarily an Acorn enthusiast and our first home computer was an Electron (although the first computer I used at primary school was a Commodore PET).

I fondly remember some of the hacks I did on/with/to the Electron, including soldering a pair of headphones into the motherboard to avoid annoying my parents with the music from various Superior Software titles 🙂

Regardless of “allegiance”, Horizons was a really great day. Highlights for me included a fantastic history of computing by PJ Evans from The National Museum of Computing at Bletchley Park (if you haven’t been there yet, you should visit!); Spectranet, an Ethernet adapter for the Spectrum, which had me wanting one for no good reason I can come up with; and the mind-blowing live composition of a chip tune by Matt Westcott, which I watched but struggled to comprehend. Matt’s ability to reverse engineer a tune in his head was remarkable.

Oh, and if you haven’t downloaded or bought MJ Hibbett‘s Hey Hey 16k yet, or at least streamed it, you really should.

aside: since Horizons was part of SciFi London, I tried to get Micro Men director Saul Metzstein to drop some hints about his upcoming S7 Dr Who episodes. All he would say was that the western episodes were filmed in Spain (knew that), and that the script for the Christmas episode hasn’t been written yet (didn’t know that).

Now

Components

After the event on Saturday evening, I found it a real struggle to avoid crazy, nostalgia-fuelled eBay purchases, but I did manage to resist! Instead, I resolved to finally get around to building the Fignition I’d picked up at the Hack to the Future event a couple of months ago.

For those who are not familiar with it, the Fignition is a credit-card-sized, build-it-yourself 8-bit computer based around the ATmega chip (the same one used in the Arduino and Nanode open source hardware boards). It’s really a remarkable little device – I guess it took me about an hour to assemble and solder, although your mileage may vary. The build guide is excellent and very clear. After performing a couple of power-on tests with and without the ICs inserted, it was time to connect it up to the TV – and it worked first time. It boots into a simplified Forth environment, reminiscent of that BBC BASIC > prompt I’m so familiar with from my childhood. The only real downside is that the keyboard – built from 8 clicker buttons – is a bit fiddly to get to grips with, but hey – I just assembled a complete 8-bit computer, including video out and keyboard! It’s hard not to be excited.

The board I built was a RevD – the new RevE board has onboard audio in/out (get ready for some fun loading stuff from audio cassettes, again!), and is also slightly modified so that in principle, it is possible to add Arduino-footprint shields. That’s kind of cool, as it means that it might be possible to add a PS/2 keyboard or a network interface.

Ready to test!

What’s “the point” of something so simple by today’s standards? Well, actually – the simplicity. I went from a bag of components to a fully working, programmable computer in the palm of my hand (no surface-mount components to worry about). It’s “primitive” by the standards of today’s machines, but it’s not that hard to understand how an 8-bit “brain” works, compared to the 32- or 64-bit multicore CPUs and GPUs in modern laptops and mobile phones. In my opinion, the Fignition, Arduino and Nanode fulfil an important role in helping youngsters to understand the basic principles of electronics and computing.

Next

Last night I headed along to the fantastic Mozilla offices in London.

Mozilla Space, London

The main LJC event was Simon Maple from IBM showing off the new WebSphere 8.5 Liberty Profile running on a Raspberry Pi. I’d hooked Simon up with Sukkin Pang recently so that he could get one of the smart enclosures he provides for the Pi. It was pretty cool to see a full Java app server running on such a small computer – actually almost exactly the same size as the Fignition, only considerably more powerful of course.

The whole talk was live-streamed on Air Mozilla – but if you missed it, there’s a video available (complete with semi-professional heckling from yours truly!).

Boot 2 Gecko

What stole the evening for me, though, were two other glimpses of what lies ahead. First, Tom Banks from IBM Hursley came on stage after Simon and showed off the Liberty profile running on a mobile phone. Let me clarify – he was running Android 2.3 on a Nexus One (an “old” phone), Ubuntu Linux as a virtual image inside that, and WebSphere inside that. Kind of mind-blowing! A proof of concept, and arguably not very useful – I’m not sure when I would want to put a full JEE app server on a phone – but extremely cool. Finally, @cyberdees let Tom and me have a play with Boot to Gecko, Mozilla’s new mobile play. B2G was something I’d heard about, but not touched. I have to say that even in an early form it’s looking very slick: it boots extremely fast – much more quickly than any Android or iOS device I’ve seen – and the device integration (GPS, camera, access to hardware settings, etc.) was impressive.

With the Open Web as the platform, ubiquitous mobile devices, and increasingly sophisticated cloud-based backends to interact with, the future is looking pretty cool.

Unshaved yaks with MonoDevelop (and some pre-shaved ones, too)

This yak is ready to go!

There’s a cool Cloud Foundry fan site called preshavedyak.com – and last week at SourceDevCon London, we challenged a bunch of developers to earn themselves a nice new preshavedyak hoodie by registering for a Cloud Foundry beta account and seeing how quickly they could get a “hello world” app up and running in the cloud. The event saw a number of new signups and some great discussions.
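
For anyone wondering how low that bar really is, the whole challenge boils down to a handful of commands with the vmc client – a minimal sketch, assuming you have Ruby installed, a beta account, and a trivial one-file app sitting in the current directory:

# Sketch: the short path from zero to a running app on cloudfoundry.com.
gem install vmc                   # the Cloud Foundry command-line client
vmc target api.cloudfoundry.com   # point the client at the hosted service
vmc login                         # your beta account credentials
vmc push hello                    # answer the interactive prompts (URL, memory, etc.) and you're live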

The “pre-shaved yak”, of course, is one aspect of what a polyglot open source PaaS is all about – delivering a ready-made, ready-to-host application runtime environment. We shaved the yak, so you can just go ahead and get productive with your development tool of choice, be that vi or emacs, Notepad or TextMate, or Eclipse / a.n.other IDE. Grab a Micro Cloud Foundry VM image and take your pre-shaved yak with you when you’re not connected! 🙂

I actually started to write this post in order to comment on something a bit more hairy than that, though! I’ve been playing around a little with MonoDevelop and ASP.NET (for reasons that will become apparent during this week, I suspect). I’m using the current stable Mono (2.10) and MonoDevelop (2.8) packages on Lion, and they seem to work well. I’ve also recently been learning about Sinatra, the lightweight web framework for Ruby, and one of its node.js equivalents, Express. It turns out that the .NET world has a bunch of Sinatra wannabes, the most popular of which appears to be Nancy (see what they did there…? welcome to the world of Sinatra-themed naming for web frameworks!).

Nancy’s site recommends installation via NuGet, which is evidently really well integrated into Visual Studio (NuGet is the equivalent of gem in Ruby, or npm in node.js). Unfortunately there’s no MonoDevelop equivalent. Here’s where the yak shaving started! The NuGet FAQ claims that the command-line NuGet.exe will run and can be compiled under Mono, but in my experience that’s not quite true – I could not get the source to compile in MonoDevelop on OS X. I grabbed the pre-compiled version and followed the instructions to get it to update itself (basically you just run it, and it bootstraps and downloads the latest available version)… that went fine, but after that it would no longer work, and produced a huge stack trace.

So here, after getting most of a yak’s fleece all over me, is the secret. The prebuilt NuGet.exe will work under Mono on OS X, but it does require a Windows .NET 4.0 DLL (Microsoft.Build.dll) to be in the same directory or locatable on the path – I grabbed mine from my Windows VM install. It also requires that you tell Mono to present a v4.0 runtime. So I whipped up a tiny script to avoid having to type a bunch of paths and switches each time.
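
It amounts to something like this – the path is an assumption for your own layout; the essentials are Microsoft.Build.dll sitting next to NuGet.exe, and the --runtime switch:

#!/bin/bash
# nuget: wrapper for running the prebuilt NuGet.exe under Mono on OS X.
# Assumes NuGet.exe and the Microsoft.Build.dll copied from a Windows .NET 4.0 install
# live together in ~/bin/nuget/ -- adjust for your own setup.
NUGET_DIR="$HOME/bin/nuget"
exec mono --runtime=v4.0 "$NUGET_DIR/NuGet.exe" "$@"

Drop that somewhere on your PATH as nuget, make it executable, and something like nuget install Nancy then behaves much as it does on Windows.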

Further results of this recent dalliance in .NET land will be coming soon…