Tim Bauer’s Running Thoughts

Semi-daily webcast summaries/insights

Cal Henderson (Flickr): Automate, Automate, Automate

If you want to hear how Flickr develops and manages its website and client software, this is the webcast for you. That, or if you want to see a picture of a kitty (you get 2). Doesn’t that cover everyone?

As usual my raw notes are below, but here are the thoughts that jumped out at me as I ran (i.e. jogged) along.

Details Notable Points
Title/Link:

Duration:

  • ~45m

Speakers:

Recommend to Watch? Maybe

  • The material was a bit dry, but at various stages of the talk Cal brought out points that were, to me, hidden gems. If you own similar responsibilities in your patch of the world, I would watch this (and his other talks on scaling).
1. Commit all the time … to PROD ?!?! AMEN!

  • Are your hands sweating yet? Cal talked about how @ Flickr they are constantly pushing to PROD, as in daily, hourly, etc. They do it via a scheme of configuration settings that enable and disable functions / features on specific boxes / regions. In effect, there is latent code in PROD at Flickr, growing until they activate it. Amusing, as I am working with a client right now where I was pushing this approach and getting pushback. At least I don’t feel totally off my rocker now. Cal has my back.
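
To make the idea concrete, here is a minimal Python sketch of config-driven feature flags. The talk did not show Flickr's actual config format, so the flag names, environment names, and host name below are all hypothetical.

```python
# Hypothetical feature-flag config: every name here is made up for illustration.
FLAGS = {
    "new_uploader": {"environments": {"dev", "staging"}, "hosts": set()},
    "photo_maps": {"environments": {"dev", "staging", "prod"}, "hosts": set()},
    "video_beta": {"environments": {"dev"}, "hosts": {"www42"}},
}


def is_enabled(flag, environment, host=""):
    """Return True if the flag is on for this environment or this specific box."""
    rule = FLAGS.get(flag)
    if rule is None:
        return False  # unknown flags stay off, so latent code stays dark
    return environment in rule["environments"] or host in rule["hosts"]


def render_upload_page(environment, host=""):
    # The new code ships to PROD with everything else but only runs when the flag is on.
    if is_enabled("new_uploader", environment, host):
        return "new uploader"
    return "old uploader"


if __name__ == "__main__":
    print(render_upload_page("staging"))
    print(render_upload_page("prod"))
```

Turning a feature on in PROD is then a config change rather than a code release, which is what lets the code sit latent until it is ready.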

2. Controls Via Yelling, Then IM, Then Something …

  • He talked about how Flickr evolved from just bellowing across their area to greenlight a push to PROD, to IM, then to a tools-based approach that they built. I found it to be good counsel. People tend to over-automate early and bog themselves down. The key is making the pushes to PROD simple … which in small teams often just takes a bit of talking.
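
The raw notes below describe the deploy log as a web page people type into that shows the tail of a file. A rough Python sketch of that idea follows, assuming a plain text file as the store. The file name, user names, and messages are hypothetical, and the web page around it is left out.

```python
import os
import tempfile
from collections import deque
from datetime import datetime, timezone


def log_deploy(path, who, message):
    """Append one timestamped line to the shared deploy log."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    line = f"{stamp} {who}: {message}"
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(line + "\n")
    return line


def tail(path, count=10):
    """Return the last `count` lines, which is what the log page would display."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


if __name__ == "__main__":
    log_path = os.path.join(tempfile.mkdtemp(), "deploy.log")
    log_deploy(log_path, "alice", "staging new uploader strings")
    log_deploy(log_path, "bob", "deploying to PROD")
    for entry in tail(log_path, 5):
        print(entry)
```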

3. SVG Lover … What No FLEX?

  • He gave love to SVG as their approach to charting and trending various statistics on their rigs @ Flickr. But, I assume, he probably should / would take a gander at FLEX if he had to do it over again, as it doesn’t have the support issues of SVG (no Batik needed) and has a far more powerful charting library.
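
For a feel of what "define a chart via code" means with SVG, here is a small Python sketch that turns time-based samples into SVG markup. It is my own illustration, not Flickr's tooling: the function name, default title, and chart size are assumptions, and the Batik step that converts SVG into a regular image is not shown.

```python
from xml.sax.saxutils import escape


def svg_line_chart(samples, width=400, height=120, title="hits per minute"):
    """Render a list of numeric samples as a minimal SVG line chart."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to draw a line")
    low, high = min(samples), max(samples)
    span = (high - low) or 1  # avoid dividing by zero on a flat series
    step = width / (len(samples) - 1)
    points = []
    for index, value in enumerate(samples):
        x = index * step
        # SVG's y axis grows downward, so flip the value.
        y = height - (value - low) / span * height
        points.append(f"{x:.1f},{y:.1f}")
    joined = " ".join(points)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        f"<title>{escape(title)}</title>"
        f'<polyline fill="none" stroke="black" points="{joined}"/>'
        "</svg>"
    )


if __name__ == "__main__":
    print(svg_line_chart([12, 40, 33, 58, 21]))
```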

4. Admin Everything … God Mode

  • He showed how they have “God Mode” on all pages in Flickr. From what I could tell … they enable a process where an admin can go into a page and see all the system objects supporting it AND edit them. Very nice. Stellent takes a similar approach w/ their content management (a few keystrokes and you can edit a page you are viewing). Flickr’s God Mode is just more technically focused (database tables, config tables, localization, etc. … all relative to a page).
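
One of the God tools in the raw notes is a cache checker that shows what is in the database versus what is in cache. A minimal Python sketch of that comparison is below, with plain dicts standing in for both stores. The keys, values, and report layout are hypothetical.

```python
def check_cache(database, cache):
    """Compare cached rows against the database and report the differences."""
    report = {"stale": {}, "orphaned": [], "uncached": []}
    for key, cached_value in cache.items():
        if key not in database:
            report["orphaned"].append(key)  # cached but no longer in the database
        elif database[key] != cached_value:
            report["stale"][key] = {"database": database[key], "cache": cached_value}
    report["orphaned"].sort()
    report["uncached"] = sorted(key for key in database if key not in cache)
    return report


if __name__ == "__main__":
    database = {"photo:1": "cat", "photo:2": "kitty", "photo:3": "dog"}
    cache = {"photo:1": "cat", "photo:2": "kitten", "photo:9": "gone"}
    print(check_cache(database, cache))
```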

The talk overall started slow, but the points above, coupled with his discussion of the toolset in play @ Flickr at the end, made it well worth listening to.
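
The recurring theme in that toolset is the single button that runs many steps and logs success / failure. Here is a rough Python sketch of such a runner. The step names and log format are hypothetical, and a real deploy would shell out to build and sync scripts rather than call no-op functions.

```python
def run_deploy(steps, log):
    """Run each (name, callable) step in order; stop and report at the first failure."""
    for name, action in steps:
        try:
            action()
        except Exception as error:
            log.append(f"FAIL {name}: {error}")
            return False
        log.append(f"OK   {name}")
    return True


if __name__ == "__main__":
    def restart_failed():
        raise RuntimeError("web server did not come back")

    deploy_log = []
    steps = [
        ("export from version control", lambda: None),
        ("copy to PROD boxes", lambda: None),
        ("restart web servers", restart_failed),
    ]
    succeeded = run_deploy(steps, deploy_log)
    print("\n".join(deploy_log))
    print("deployed" if succeeded else "deploy stopped")
```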

** START OF RAW SCRIBBLE TAKEN WHILE RUNNING **

• Did a talk at Webstock … Cal Henderson – arch for flickr
……………..· http://www.iamcal.com/talks/
……………..· his site → http://www.iamcal.com
……………..· his twitter → http://twitter.com/iamcal
• Notes
• Building Big on The Web
• 4/22/2008, 6:08 AM
• Flickr
• Usually he talks about scaling
• Not today
• Today talks about how to build
……………..· Interactive systems
• Over focus on process
……………..· It is important
……………..· XP, Waterfall, Agile, Scrum
……………..· He doesn’t care as much about process (methodology)
• 4/22/2008, 6:11 AM
• Don’t have methods slow down teams
• Today’s talk is about tools … what they use and why
• 4/22/2008, 6:12 AM
• Old ways of tools
……………..· Txt editor, vi, emacs
……………..· People still use this … especially personal sites
……………..· Bigger sites can’t do it that way
• More tools w/ bigger
……………..· Release Management
• 4/22/2008, 6:16 AM
• Continuous Integration
……………..○ Martin Fowler
……………..○ Work, commit immediately, trigger test
……………..○ Update … get changes
……………..○ Test constantly
• Tests Are Good, Tests Are Dull
……………..○ On average test coverage is very small (dull, hard to keep up)
• 4/22/2008, 6:17 AM
• How deal w/ that … automate everything
……………..○ Automate tests to hit the trunk
• Mozilla’s Tinderbox
……………..○ Aggregate automated tests on clients … and see in one place
……………..○ Shows time new to old by build (y axis) … by machine (column, x)
• Flickr’s Tinderbox
……………..○ Run the test, about 1000 items, results on web page
……………..○ Wrap test cases on stuff that is most brittle … so most likely or core
……………..○ Run once an hour
……………..○ Email on failure
• Version Control = Blame
……………..○ They email on who changed code since last successful build
……………..○ Force of peer pressure to get fixes
• Continuous Production
……………..○ Example of glass, how it has to run continuously to work
……………..○ Flickr calls it continuous deploy … constantly release their software to PROD
…………………………….§ BAUER COMMENT THERE IS A CONTENTIOUS POINT
• Process typical
……………..○ Dev -> qa -> stage -> prod
……………..○ 4/22/2008, 6:23 AM
……………..○ Reality … no QA … dev->stage->prod
…………………………….§ For medium to large sites
• Flickr process
……………..○ Dev, Alpha environments
……………..○ Version control line
……………..○ Staging beta1 beta 2
…………………………….§ Pull from version control
……………..○ Prod
…………………………….§ Comes from staging
• Feature flags, avoid branches
……………..○ Weird feature of flickr
……………..○ Avoid branching
……………..○ The more differences … the harder to merge
……………..○ He is against it based on that
……………..○ New features based on config flags … turn on/off features … environment flags control which features work in which environment
• Shrinkwrap-ware
……………..○ Process –> alpha, beta, rc, ga
…………………………….§ Rc — close to good enough
…………………………….§ Ga - golden master
……………..○ Box process also adds –> RTM (to cover boxing) … comes before GA
……………..○ Flickr uploader is of this type
…………………………….§ Alpha -> beta -> GA -> Push
……………………………………………□ Push, release force upgrade in PROD by users
• 4/22/2008, 6:27 AM
• Release tools
……………..○ Agile
…………………………….§ Releasing to PROD quickly … tools are what we need to enable that
…………………………….§ Makes releases to PROD simpler
…………………………….§ Many times a day / hour
……………..○ One tool –> Yelling between people
…………………………….§ Their first version …
…………………………….§ 2-3 people
……………..○ Another –> Via IM
…………………………….§ Scales a bit larger
……………..○ Deploy Log –> Web page
…………………………….§ Shows lines of change …
…………………………….§ Type into
…………………………….§ Shows tail of a file
…………………………….§ On deploy tools page
• Public deploy log
……………..○ Show beta site code.flickr
……………..○ Follow who breaks what
……………..○ Shows what people are up to
……………..○ Show people Flickr is working
……………..○ Public?
• 4/22/2008, 6:31 AM
• Staging tool
……………..○ Assemble lang pieces
……………..○ Put pieces on staging for testing
……………..○ A whole bunch of text on page
……………..○ Button -> perform staging for end to end process to run … key is one button
…………………………….§ Should be one script
• 4/22/2008, 6:33 AM
• Compile
……………..○ Build web interfaces quickly
……………..○ Ajax checks on compile status
……………..○ Look at file on disk check on progress
……………..○ See on deploy page on where they are in compile
• 4/22/2008, 6:33 AM
• Deploy system
……………..○ Single button again
……………..○ Press (if allowed)
……………..○ Done
……………..○ Button does 300 things … logs success / failure
……………..○ One touch deployment
• What changed from last deploy
• Config deploy
……………..○ Config files
……………..○ Flags change a lot in config files
…………………………….§ Bauer comment - configuration management
……………..○ Process to manage configuration changes
……………..○ Form, edit file … hit button … deploys to PROD boxes
……………..○ Things they do a lot they change to scripts
…………………………….§ Bauer comment — a lot is relative to process … if you are not agile like them what you do a lot is drastically different (so you might automate the wrong thing and think you are fine tuned …. But you forgot the re-engineer step)
• 4/22/2008, 6:37 AM
• Mozilla AUS - Auto Update Services
……………..○ Pings URL from desktop … gets if new version avail and downloads as plugin
……………..○ Http check …
……………..○ Not dependent on using Mozilla … they hit the servers of Mozilla not the client
……………..○ Update scripts …
• 4/22/2008, 6:39 AM
• Development Process
……………..○ Bug Tracking
…………………………….§ Simple summary
…………………………….§ Not 25 fields like @ yahoo! Bug report … short /long tickets … training .. Egads
…………………………….§ Flickr … simpler
……………………………………………□ 2 fields … title / desc
……………………………………………□ Sits on top of a powerful system but doesn’t expose that to reporters
…………………………….§ Track projects
……………..○ Source control viewer
…………………………….§ UVC
…………………………….§ Diffs, Blame Log,
…………………………….§ Critical
…………………………….§ Track to track bugs … then use tracks on source browser
…………………………….§ Link mailing list to source control … mail w/ links to diff viewer
……………………………………………□ Could also do rss feed by dev
…………………………….§ LXR / Indexers
……………………………………………□ LXR - Linux Cross Referencer … Theory … looks at source code and looks at bits of it and see where it is used across the application …. Click on class name … find definition
• 4/22/2008, 6:43 AM
• Maintenance
……………..○ Monitoring
…………………………….§ Nagios. Ugly but awesome. Servers and services up / down
……………………………………………□ Most big sites use
…………………………….§ Ganglia - Gather stats on bits of apps, servers, services
……………………………………………□ Used a lot @ Flickr
……………………………………………□ Overview by data center
……………………………………………□ Server drill down from data center
……………………………………………□ Color coded
……………………………………………□ Graphs on box stats - cpu, memory, etc
……………………………………………□ Zero config (easy setup)
……………………………………………□ Graphing over time , historical records
…………………………………………………………..® Ie hits on a server
……………………………………………□ Open source software
……………………………………………□ RRD - Round Robin Database Tool
…………………………………………………………..® Fixed space, snapshot of data over time … lose resolution as time goes by … auto drop of data long term
……………………………………………□ Stack stats from RRD and look at trending and relationships of data
……………………………………………□ They built a custom tool to monitor mysql
…………………………………………………………..® They use that
…………………………………………………………..® Look at threads, logs, select performance … in open source program called DV Stats
…………………………………………………………..® Used for performance tuning … slow down trouble shooting
………………………………………………………………………….◊ Look at ganglia for tablelocks for example (pulling from RRD and mysql stuff)
………………………………………………………………………….◊ Immediate bug fix
……………………………………………□ RRD graphs look the same … time based sample data
…………………………………………………………..® Custom … mysql, svg, batik … views via that
………………………………………………………………………….◊ Svg - scalable vector graphics … define via code
…………………………………………………………………………………………► Bauer comment - FLEX knockoff …
…………………………………………………………………………………………► Limited support
…………………………………………………………………………………………► Batik … converts svg to regular graphics
………………………………………………………………………….◊ So they go from mysql -> svg -> batik
• 4/22/2008, 6:50 AM
• God Tools - Admin Tools
……………..○ Comes from GNE - Game Never Ending (company that built flickr built that then flickr)
…………………………….§ Collected paper game … (bauer comment - hrm)
…………………………….§ Actions performed by god … flickr.com/god
……………..○ Example
…………………………….§ So you admin from website … each page has admin pages tied to it
…………………………….§ In context of site
…………………………….§ See a product in site .. Click link and see / edit relevant files, configs, etc in PROD
……………..○ Cache Checker
…………………………….§ See what is database versus what is cache
…………………………….§ Dumps and troubleshoots data differences
……………..○ Customer care
…………………………….§ Help request via customer profile pages
……………..○ API data
…………………………….§ Graphs of various things (svg of course)
……………..○ Localization
…………………………….§ Multiple lang
…………………………….§ Every string has tags to be localized
…………………………….§ Looked at 3rd party for localization … all sucked … so they built their own translation management interface
…………………………….§ So by string they translate
……………..○ Admin profile
…………………………….§ Bug tracker
…………………………….§ Flickr account
…………………………….§ …
…………………………….§ Obsessed w/ tools
……………..○ Very large system
…………………………….§ 32k lines of php
…………………………….§ 24k of html
…………………………….§ Update each time ask or do things … automate as they go
• 4/22/2008, 6:56 AM
• Final points
……………..○ Use robots to automate anything you do … single press of button

** END RAW SCRIBBLE TAKEN WHILE RUNNING **
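
A footnote on the RRD bullet in the notes (fixed space, lose resolution as time goes by): the toy Python class below sketches that behavior. It is only an illustration of the idea and not RRDtool's file format or API. The class name, slot counts, and the choice of averaging are my own assumptions.

```python
from collections import deque


class RoundRobinArchive:
    """Fixed-size store: recent samples at full resolution, older ones averaged."""

    def __init__(self, fine_slots=6, coarse_slots=4, samples_per_coarse=3):
        self.fine = deque(maxlen=fine_slots)      # newest raw samples
        self.coarse = deque(maxlen=coarse_slots)  # averages covering a longer window
        self.samples_per_coarse = samples_per_coarse
        self._pending = []

    def add(self, value):
        """Record one sample; the oldest data falls off automatically."""
        self.fine.append(value)
        self._pending.append(value)
        if len(self._pending) == self.samples_per_coarse:
            # Consolidate a full batch into one lower-resolution point.
            self.coarse.append(sum(self._pending) / len(self._pending))
            self._pending = []


if __name__ == "__main__":
    archive = RoundRobinArchive()
    for hits in range(1, 31):
        archive.add(hits)
    print("fine:  ", list(archive.fine))
    print("coarse:", list(archive.coarse))
```

However many samples go in, the archive never holds more than fine_slots + coarse_slots values, which is the "fixed space" property from the notes.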

April 22, 2008 - Posted by bauertim | 2-Perhaps (what floats your boat?) | 1 Comment
