Tag Archives: web

A simple website on Cloud Foundry

I’ve been remiss in blogging since switching job roles, so it’s about time to change that!

One of the goals of a Platform as a Service (PaaS) – like Cloud Foundry – is to enable developers to be more productive, more quickly. It’s about getting out of the way, removing the barriers and setup steps, and enabling developers to write and deploy great code as quickly as possible.

Something I’ve needed to do fairly often since starting work with Cloud Foundry is to quickly put up a “static” web site. The platform supports a number of runtimes and frameworks (Java, Ruby, node.js etc.) but it doesn’t currently[1] have a runtime type of “website”. So I can’t simply put together a bunch of HTML, CSS, images and client-side Javascript files, run vmc push, and have my site online on cloudfoundry.com – I need an “application” to serve the web content.

That’s exactly what my sinatra-static-web project does for me. I’ve found that it’s a very handy and quick template application which enables me to get simple static sites up on Cloud Foundry, and a good starting point to build out from if I want to stretch my Ruby skills 🙂

To use it, simply fork or clone the project using Git; replace the entire contents of the public directory with your own HTML, CSS and JS files (with an index.html file as the main page); optionally adjust a couple of settings in the web.rb file; and vmc push the app. You can take a look at the sample site I’ve added to the app, of course… it’s just a load of junk content based on Twitter Bootstrap, with some random Lorem Ipsum-style text to fill it out.

There’s no real need to go near the code, and it is trivial at any rate – but let’s take a quick look.

# a super-trivial Sinatra-based webserver
# for static content
require 'sinatra'

# set all the settings!

configure do
  # this is arguably not necessary... 'public'
  # folder is the static content location by default
  set :public_folder, 'public'

  # optionally configure Cache-Control headers on responses
  # set :static_cache_control, [:public, :max_age => 300]

  # if using mime types not known to Sinatra, uncomment and
  # configure here (by file extension)
  # mime_type :foo, 'text/foo'
end

# serve the files!

# route to starting page (index.html)
get "/" do
  redirect '/index.html'
end

# route to custom error page (404.html)
not_found do
  redirect '/404.html'
end

The code uses the super-handy Sinatra framework for Ruby, which allows an application with multiple URLs to be defined very quickly. In this case, we simply declare a dependency on Sinatra; set the public folder as the one where the static content resides; and then create a default route, so that when a user hits our root URL / they are redirected to the index.html file. We also create an error route so that if the user hits a URL that doesn’t exist, they receive a customised but simple 404 error page (assuming that such a file exists in the public folder!).

As you can see, there’s really only a few lines of code here, and the rest is handled by the framework. I’ve commented out a couple of optional parameters that can be used if desired, but without any changes this will serve the contents of the public folder perfectly happily.
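
As an aside, if you’d rather have the root URL serve index.html directly instead of issuing a redirect, a minimal variation on that first route (my own tweak, not something the project ships with) could use Sinatra’s send_file helper, dropped into web.rb in place of the existing get "/" block:

# serve index.html at / directly, so the browser stays on the root URL
# rather than being redirected to /index.html
get "/" do
  send_file File.join(settings.public_folder, 'index.html')
end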

I’ve used this a few times now, for sites of varying levels of complexity – in particular, the resources site I created for Cloud Foundry’s sponsorship of Young Rewired State was based on this (the source code is on Github if you want to take a look at that, too – it’s understandably extremely similar!). I was also able to use it to help a number of students I worked with at YRS 2012 to get their sites online. More on YRS shortly…

Just a simple little resource that you might find handy for prototyping your next web UI – you don’t even need to know Ruby, Java, or node.js to get going!

[1] … note that I’m not saying that Cloud Foundry should have or will have such a type of container in the future – but the code base is Open Source, so there’s every chance that someone will come along and add this kind of thing one day!

Update 05 March 2013: I just pushed a few changes to the app to reflect a slight change in the way Sinatra apps work on Cloud Foundry now. Use the source!


Digging through what Twitter knows about me

I joined Twitter on February 21, 2007, at exactly 15:14:48, and I created my account via the web interface. As you can see, my first tweet was pretty mundane!

I remember discussing this exciting cool “new Web 2.0 site” with Kim Plowright @mildlydiverting in Roo’s office in Hursley a couple of days before, and before long he, Ian and I were all trying this new newness out. It was just before the 2007 SXSWi, where Twitter really started to get on the radar of the geekerati.

But wait a moment! You can’t pull back more than just over the last 3,000 tweets using the API, so how was I able to get all the way back to that first tweet from five years ago, when I’ve got over 33,000 of them to my name?

It’s a relatively little-known fact that you can ask Twitter to disclose everything they hold associated with your account – and they will (at least, in certain jurisdictions – I’m not sure whether they will do this for every single user but in the EU they are legally bound to do so). I learned about this recently after reading Anne Helmond’s blog entry on the subject, and decided to follow the process through. I first contacted Twitter on April 24, and a few days later faxed (!) them my identity documentation, most of which was “redacted” by me 🙂 Yesterday, May 11, a very large zip file arrived via email.

I say very large, but actually it was smaller than the information dump that Anne received. Her tweets were delivered as 50Mb of files, but mine came in nearer to 9Mb zipped – 17Mb unzipped. I’d expected a gigantic amount of data in relation to my tweets, but it seems as though they have recently revised their process and now only provide the basic metadata about each one rather than a whole JSON dump.

So, what do you get for your trouble? Here’s the list of contents, as outlined by Twitter’s legal department in their email to me.

– USERNAME-account.txt: Basic information about your Twitter account.
– USERNAME-email-address-history.txt: Any records of changes of the email address on file for your Twitter account.
– USERNAME-tweets.txt: Tweets of your Twitter account.
– USERNAME-favorites.txt: Favorites of your Twitter account.
– USERNAME-dms.txt: Direct messages of your Twitter account.
– USERNAME-contacts.txt: Any contacts imported by your Twitter account.
– USERNAME-following.txt: Accounts followed by your Twitter account.
– USERNAME-followers.txt: Accounts that follow your Twitter account.
– USERNAME-lists_created.txt: Any lists created by your Twitter account.
– USERNAME-lists_subscribed.txt: Any lists subscribed to by your Twitter account.
– USERNAME-lists-member.txt: Any public lists that include your Twitter account.
– USERNAME-saved-searches.txt: Any searches saved by your Twitter account.
– USERNAME-ip.txt: Logins to your Twitter account and associated IP addresses.
– USERNAME-devices.txt: Any records of a mobile device that you registered to your Twitter account.
– USERNAME-facebook-connected.txt: Any records of a Facebook account connected to your Twitter account.
– USERNAME-screen-name-changes.txt: Any records of changes to your Twitter username.
– USERNAME-media.zip: Images uploaded using Twitter’s photo hosting service (attached only if your account has such images).
– other-sources.txt: Links and authenticated API calls that provide information about your Twitter account in real time.

Of these, let’s dig a bit more deeply into just a few of the items – there’s no need to pick everything to pieces.

The “tracking data” is contained in andypiper-devices.txt and andypiper-ipaudit.txt – interesting. The devices file essentially contains information on my phone, presumably for the SMS feature. They know my number and the carrier. The IP address list tracks back to the start of March, so they have two months of data on which IPs have been used to access my account. I’ve yet to subject that to a lot of scrutiny to check where those addresses are located – that’s another script I need to write.
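
As a starting point for that script, a rough Ruby sketch along these lines (my own, and it assumes only that each login record in the audit file includes an IPv4 address somewhere on the line) would at least show which addresses crop up most often:

# first pass over the IP audit file: pick out anything that looks like
# an IPv4 address and count how many login records it appears in
ips = Hash.new(0)
File.foreach('andypiper-ipaudit.txt') do |line|
  line.scan(/\b\d{1,3}(?:\.\d{1,3}){3}\b/).each { |ip| ips[ip] += 1 }
end
ips.sort_by { |_, count| -count }.each { |ip, count| puts "#{count}\t#{ip}" }

Feeding the resulting list into a geolocation lookup would be the obvious next step.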

I took a look at andypiper-contacts.txt and was astonished to find out how much of my contact data Twitter’s friend finder and mobile apps had slurped up. I mean, I don’t even have all of this in my address book… given the fact that the information contained the sender email addresses for various online retailer newsletters, I’m guessing that Google’s API (I’m a Gmail user) probably coughed up not just my defined contact list, but also all of the email addresses from anyone I’d ever heard from, ever.

Fortunately, there’s a way to remove this information permanently, which Anne has written about. I went ahead and did that, and then Twitter warned me that the Who To Follow suggestions might not be so relevant. That’s OK because I don’t use that feature anyway – and in practice, I’ve noticed no difference in the past 24 hours!

I use DMs a lot for quick communication, particularly with colleagues (it was a pretty reliable way of contacting @andysc when I needed him at IBM!). That’s reflected in the size of andypiper-dms.txt, which is also a scary reminder that I used to delete DMs regularly – but since Twitter now makes it harder to get to and delete them, I’ve stopped, and there’s a lot of private data in there that I wish I’d scrubbed.

Taking a peek at the early tweets in andypiper-tweets, I’m trying to remember when the @reply syntax was formalised and when Twitter themselves started creating links to the other person’s profile. Many of my early tweets refer to @roo and @epred and I don’t think they ever went by those handles. 5 years is a long time.

I mentioned that the format used to deliver the data appears to have changed since Anne made her request. She got a file containing a JSON dump of each tweet, including metadata like retweet information, in_reply_to, geo and so on. By comparison, I now simply get the creation info, the status ID (the magic that lets you get back to the tweet via the web UI), and the text itself:

********************
user_id: 786491
created_at: Wed Feb 21 15:43:54 +0000 2007
created_via: web
status_id: 5623961
text: overheating in an office with no 
comfort cooling or aircon. About to drink water.

It’s a real shame that they have taken this approach, as it means the data is now far more cumbersome to parse and work with. However, using some shell scripts I did some simple slicing-and-dicing, because I was curious how my use of Twitter had grown over time. Here’s a chart showing the number of tweets I posted per year (2012 is a “to date” figure, of course). It looks like growth was slow initially, but last year I suddenly nearly doubled my output.
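
In case it’s useful, here’s a rough Ruby equivalent of that slicing-and-dicing (not the shell scripts I actually used; it just assumes the record layout shown above, with each record separated by a line of asterisks):

# tally tweets per year from the delivered dump, relying on the fact
# that each created_at line ends with a four-digit year
counts = Hash.new(0)
File.read('andypiper-tweets.txt').split(/^\*+\s*$/).each do |record|
  year = record[/^created_at:.*\b(\d{4})\s*$/, 1]
  counts[year] += 1 if year
end
counts.sort.each { |year, total| puts "#{year}: #{total}" }

Swapping created_at for created_via in the same loop gives a first cut at the client application breakdown I mention below.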

Still considering what other analysis I’d like to do. I can chart out the client applications I’ve used, or make a word cloud showing how my conversational topics have changed over time… now that all of the information is mine, that is. It is just a shame I have to do so much manual munging of the output beforehand.

Oh, and the email I received from Twitter Legal also said:

No records were found of any disclosure to law enforcement of information about your Twitter account.

So, that’s alright then…

Why did I do this? Firstly, because I believe in the Open Web and ownership of my own data. Secondly, because I hope that I’ll now be able to archive this personal history and make it searchable via a tool like ThinkUp (which I’ve been running for a while now, but not for the whole five years). Lastly… no, not “because I could”… well, OK, at least partly because I could… but mostly because I believe that companies like Twitter, Facebook, Google and others should be fully transparent with their users about the data they hold, and that going through this currently-slightly-painful procedure will encourage Twitter to put formal tools in place that provide this level of access to everyone in a frictionless manner.

If you’ll excuse me, I’m off to dig around some more…

European WebSphere Technical Conference 2011

Although I realise that it seems as though I do little other than spin around “the conference circuit” at the moment, what with the various events I’ve blogged about lately, that isn’t entirely true! However, it is just about time for another European WebSphere Technical Conference – something like a cut-down IMPACT run in Europe, a combination of the popular WebSphere and Transaction & Messaging conferences we used to run – with plenty of technical content on the latest technologies.

I’ll be in Berlin next week, 10th-14th October, participating in at least one panel, speaking about MQTT, and also covering the latest on IBM MQ messaging technologies as they relate to cloud and web. There’s a Lanyrd event page where I’ll try to collate information relating to the individual talks.

I have a feeling that by this time next week there could be quite a lot to talk about… 🙂

Historical perspectives

For those of you who have never read my About page, you may be surprised to know that as well as being a “techie”, I have an MA in Modern History (the story of how I came to have a career in technology is possibly less interesting than it might outwardly appear). As such, I wanted to take a moment to comment on a couple of things that have come up in the past week.

History teaching in the UK

I don’t remember my first history lesson, how I became aware of my own cultural background, or when or why I fell in love with the study of history. I just remember that when I came to choose exam subjects at 13/14, History was a no-brainer for me – something I thoroughly enjoyed and wanted to dive deeper into. Despite my affinity for and interest in science (I was working on some Chemistry software for RISC OS with a friend of mine at the time), it was also a natural subject for me to pursue to A-level and, eventually, as my degree subject.

I won’t claim that the transition to a technical career was straightforward. It’s true that while (in my opinion) a History graduate has a range of flexible and totally transferable skills, recruitment out of universities in the UK 15 years ago (and, I suspect, even more so today) was limited in outlook. Although I had a number of examples of technical knowledge, and had my own business selling RISC OS software with a friend, many larger organisations simply wanted a science education, and I didn’t have one to show them. I was grateful to the UK Post Office for taking a broader view of my skill set and taking me on as an IT Graduate (or, one of the “Graduates in IT Services”… yes, you work out that acronym… charming!).

Back to the subject, though. Academically, philosophically, politically, and in the pursuit of knowledge and understanding, I believe that History is vitally important. What did I gain from devoting a number of years of my life to that study? Strong analytical skills spanning multiple media; broad and, I believe, sensitive cultural awareness (yes, really – from a Brit!); and an understanding of how we became the human race we are today.

Facts about history education in the UK :-(

This past week, Professor Niall Ferguson published an editorial piece in the Guardian claiming that British history teaching was at a point of crisis.

[aside: Niall Ferguson is the best lecturer I ever had… I clearly remember his first lecture to my fellow students and me, which began with the clanging industrial noises of Wagner’s Ring Cycle, immediately capturing the attention of even the most feckless and uninterested mid-90s Oxford student (although my female colleagues seemed captured not so much by the audio as by the visuals and voice…)]

I was disappointed to read about the state of affairs described in his piece, and the accompanying article describing the loss of cohesion in the UK History curriculum. Now let’s be clear – to an extent, I was always in a privileged position with regard to education generally, and to History education as well. If things are really in such dire straits today, I do despair – I don’t get the same sense of ignorance from friends of other nationalities, and whilst I don’t advocate any kind of imperialist triumphalism in British History education, by ignoring trends, and what Niall Ferguson calls the “long arc of time”, our children clearly miss out. I’m not going to trot out clichés about how we have to understand past mistakes to avoid repeating them – we do that regardless; it’s part of the human condition and pride. The point is: there’s excitement and interest in our story. And honestly, how annoyed would you be if every story you ever heard, read, listened to or attempted to understand arrived in disjointed pieces that were impossible to lace together?

I hope the UK teaching profession, and the appropriate education authorities, listen to reason. And I hope that the apparent focus on science as the be-all-and-end-all of education learns to flex in favour of other subjects, too – speaking as a STEM Ambassador, myself.

History on the web

I’ve remarked before about the web as a historical source. The death of archive services like DejaNews (it was the archive for Usenet, eventually bought by Google, which turned it into Google Groups before burying / de-emphasising access to older content) was a terrible thing, even if it does mean that it is now very difficult to locate evidence of my embarrassing mid-teen and early-20s days online! The move to the real-time web, and the increasing focus on sites like Twitter and Facebook (through which historical search is both de-emphasised and technically virtually impossible), is increasingly reducing the value of the web as a historical resource.

Suw Charman has written about this issue this week, and it caught my attention particularly in the context of the other issues currently exercising my brain.

I return to a thought I’ve expressed previously: sites that revolve around EVENTS have an opportunity here. When I wrote about Lanyrd I said:

here’s what I think is a really cool feature. You can attach all kinds of “coverage” to an event, be it slides, audio, video, liveblogged information, blogged write-ups, etc etc. So your point-in-time event suddenly gains a social and historical footprint with an aggregation of all the content that grew up around it, which people can go back to.

The thing that really grabbed my attention this week was the seemingly-minor and gimmicky discovery that someone has created an entry for the 1945 Yalta meetings on Lanyrd. This is awesome – a demonstration of what it can provide, and what we need: the ability to tie content together and aggregate, link, and retain related information in the context of people and events. All of which is only really interesting if we have a population that understands where we (globally) have come from…

Daddy, where did the Internet come from?

I’m a big fan of podcasts. As a podcaster myself, you might expect me to say that. I know many people are not fans, and that’s OK – it’s a matter of taste, I think. For me, it’s convenient to be able to get information while I’m driving, or travelling via some other means or doing something else which makes reading difficult. I like some of the insight that comes out through deeper discussion of a topic, or even from the interaction of several people in a conversation, which you typically don’t get from a written post which is likely to be from one point of view. Audio can take more concentration than reading text, of course, and is difficult to scan, so I can understand objections – like I said, it’s a matter of taste. For me, podcasts need to be interesting, and ideally they need to be short (45 mins max) and easy to consume[1].

One particular podcast series which I came across recently (via epredator) is an excellent series of short pieces from the Open University – it’s called The Internet at 40 (iTunes link). It looks at the origins of the Internet and then covers a series of interviews with some of the pioneers like Vint Cerf and Tim Berners-Lee as well as less well-known people like Donald Davies and Ray Tomlinson. It’s mostly delivered in nice bite-sized 5-15 minute chunks, with only the first piece lasting longer than 20 minutes, and even then, that’s a compelling listen.

Ever wanted to know how this thing called the Internet evolved? I found it fascinating to listen to Donald Davies talking about the genesis of TCP/IP – I’d always understood it at a general level, but hearing these guys discuss the original thinking behind some of the fundamental concepts was really cool. As both an historian and a techie, it was great to listen and see my two worlds collide. Recommended.

[1] the one exception I make to the 45-minute rule is for the shows from TWiT – MacBreak Weekly and net@night are regular subscriptions, and the latter in particular is great for making new online discoveries. If you have the stamina for something a little longer, the TWiT network has some great shows.