Inputs and Instance Variables

There's an antipattern I see in a lot of codebases. This antipattern obstructs our vision and makes code more difficult to reason about. It's everywhere, and its name is attr_reader.

In order to understand why attr_reader can be an antipattern, we need to appreciate that methods, inputs and instance variables are not the same thing. Functional programming has aided in formalizing this antipattern for me, but I've felt it all along.

In the following code, what are first_name and last_name?

def full_name
  first_name + ' ' + last_name
end

You can speculate, but you can't be 100% confident. They're definitely methods, but looking strictly at this code, it's unclear where first_name and last_name are defined. Likely, there are attr_readers somewhere, and these methods merely mask the true values, which are instance variables:

def full_name
  @first_name + ' ' + @last_name
end

It's difficult to know for sure where the values behind first_name and last_name come from. They could be caught by method_missing, in which case there's probably more complexity to fetching these values than appears on the surface. They could be defined on some entirely different object or contain some bizarre side effects in whatever code serves them. Only once the programmer reading full_name has investigated the source of first_name and last_name can they have confidence in knowing what this code does.
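To make the method_missing possibility concrete, here is a minimal sketch. The DynamicPerson class and its attributes hash are hypothetical, invented purely for illustration. Notice that nothing inside full_name hints that its data arrives this way.

```ruby
# Hypothetical class: first_name and last_name are never defined as methods.
class DynamicPerson
  def initialize(attributes)
    @attributes = attributes
  end

  def full_name
    first_name + ' ' + last_name
  end

  private

  # Any unknown message that matches a key in @attributes is answered here.
  def method_missing(name, *args)
    @attributes.key?(name) ? @attributes[name] : super
  end

  def respond_to_missing?(name, include_private = false)
    @attributes.key?(name) || super
  end
end

person = DynamicPerson.new(first_name: 'Mike', last_name: 'Pack')
puts person.full_name
```

Reading full_name alone, this version is indistinguishable from one backed by attr_reader.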

So how can we add clarity? Two forms: inputs and instance variables.

We've already seen the instance variable form of circumventing this antipattern:

def full_name
  @first_name + ' ' + @last_name
end

This code disambiguates the source of first_name and last_name. They're instance variables; they come from within this class. There's no arguing, no guessing: they're definitely variables encapsulated within the same object on which full_name is defined.

We can now look at full_name and reason about its proximity to first_name and last_name. This code improves the spatial locality of first_name and last_name, meaning we have tightened the distance we need to travel from full_name in order to understand them.

Improving spatial locality means we have reduced the cognitive complexity induced by this method. This is beneficial, especially when picking up new codebases, as it allows the reader to be confident in the code they see. It curtails the number of assumptions they must make about what's going on.

But we can do better than instance variables.

Instance variables improve spatial locality as they inform us that the source of some data is within the context of that class. But they don't tell us specifically where the instance variables are defined; we still need to trudge through code within that file. We have a tool that even further improves spatial locality: inputs.

In the following code, we define first_name and last_name as parameters to the full_name method:

def full_name(first_name, last_name)
  first_name + ' ' + last_name
end

Now we can look at this code, in isolation, and reason about the source of all data. We know exactly where first_name and last_name come from.

This is a functional style of programming, and I won't claim that it's always better. It might feel superfluous to have to pass first_name and last_name into this method in every place the full name is needed. It would make our code more verbose and less expressive. Which would you rather see throughout your codebase: person.full_name or person.full_name(person.first_name, person.last_name)?

But there is yet another benefit to passing in arguments. Object-oriented programmers might sigh when I say this, but this full_name method is now functionally pure. It is functionally pure because it has no side effects and, at any point, if we call this method with the same first_name and last_name arguments, we will get the exact same return value every time.

Functionally pure methods are easier to test as they require less setup. With instance variables, by contrast, the object state must be correct before invoking the full_name method.
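As a rough illustration of the testing difference, the sketch below places both forms side by side. The Names module and the StatefulPerson class are names I've made up for the comparison.

```ruby
# Input form: everything the method needs arrives as arguments.
module Names
  def self.full_name(first_name, last_name)
    first_name + ' ' + last_name
  end
end

# Instance variable form: the object must be built in the right state first.
class StatefulPerson
  def initialize(first_name, last_name)
    @first_name, @last_name = first_name, last_name
  end

  def full_name
    @first_name + ' ' + @last_name
  end
end

# No setup: call the method with values and check the result.
puts Names.full_name('Mike', 'Pack') == 'Mike Pack'

# Setup required: construct an object before the method can be exercised.
person = StatefulPerson.new('Mike', 'Pack')
puts person.full_name == 'Mike Pack'
```

With only two values the difference is small, but it grows with every piece of state an object needs before the method under test can run.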

I recognize that for some people, using attr_reader is a stylistic choice. They prefer to look at the @-less reference to data, so they use attr_reader everywhere. I personally don't find @ symbols offensive. Their presence indicates a lot to me.

I do believe that attr_reader can be an antipattern, unless the intention is to expose the state of an instance variable to the outside world. We can avoid this antipattern by revealing underlying instance variables, and carefully considering granularity of inputs. Doing so can reduce complexity and dispel the magic of unnecessary abstractions.

Happy coding.

Posted by Mike Pack on 08/04/2015 at 09:38AM

Tags: locality of reference, ruby, antipattern, attr_reader


Don't "Use" Protected Methods in Ruby

For years, I didn't understand protected methods. Not because I didn't care to, but because I couldn't see the practicality. I'd been writing what I thought was quality production software and never needed them. Not once. It also didn't help that most explanations of protected methods evoked flashbacks of my worst classes in college when I realized mid-semester I had no idea what was going on. I'm not sure if that was my fault or the professor's.

Definitions of protected usually go like this: "protected methods can be accessed by other classes in the same package as well as by subclasses of its class in a different package." Uh, what?

I picked a particularly obscure definition above, but it was the third hit on Google for "ruby protected methods."

Let's get one thing out of the way early. I'm not saying you shouldn't use protected methods. I'm saying you shouldn't "use" them. As in, deliberately use them with foresight. That's why I put the word in quotes. There are perfectly valid use cases for protected methods, and I'll illuminate one, but this tool should be employed as a refactoring clarification and nothing else.

Let me show you what I mean.

Say we have a Student class. Each student has a first name, last name, and the ability to provide their full name. Because knowing strictly a first or a last name is potentially ambiguous, a student only knows how to answer by their full name, so the first and last name are private methods.

class Student
  def initialize(first_name, last_name)
    @first_name, @last_name = first_name, last_name
  end

  def name
    first_name + ' ' + last_name
  end

  private

  attr_reader :first_name, :last_name
end

Along come the professors and they want to check attendance. They plan to call attendance in alphabetical order by the students' last names. They've asked our company, Good Enough Software LLC, to find a way to sort the students by last name. We promptly tell the professors that we only have access to the students' full names. The professors quickly retort, "don't care, make it good enough."

We got this.

Since we can't call the private method #last_name, sorting by last name is a tricky task. We can't just write the following, where a classroom can sort its students:

class Classroom
  def initialize(students)
    @students = students
  end

  def alphabetized_students
    @students.sort do |one, two|
      one.last_name <=> two.last_name # BOOM
    end
  end
end

Protected methods can't help us here. This code will not work unless the #last_name method is made public. We don't want to introduce ambiguity, so we can't make #last_name public.

We need to refactor, eventually to protected methods.

This is why I say don't "use" protected methods. Using protected methods during the first iteration of a class is like grabbing your sledgehammer because you heard there would be nails. You show up only to realize the thing you'll be hammering is your grandma's antique birdbox. Inappropriate use of protected methods dilutes the intention of the object's API, damaging its comprehensibility. When reading code that utilizes protected methods, I want to be able to assume there is an explicit need for it; public and private would not suffice. Unfortunately, this is seldom the case.

We should never write new code with protected methods. There's simply not a strong case for it.

But they are helpful here. If we instead compare the two student objects directly with the spaceship operator (<=>), then we can let the student objects compare themselves using #last_name. Since private methods are accessible by the object that owns them, maybe that will work? Let's try.

We want the Classroom class to look like the following, comparing student objects instead of the last name of each student.

class Classroom
  def initialize(students)
    @students = students
  end

  def alphabetized_students
    @students.sort do |one, two|
      one <=> two
    end
  end
end

The block we pass to #sort above does exactly what #sort does by default, so we can update the code to eliminate the block:

class Classroom
  def initialize(students)
    @students = students
  end

  def alphabetized_students
    @students.sort
  end
end
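A quick way to convince yourself that the block is redundant is to compare both forms on plain strings, which already define <=>. The sample names are arbitrary.

```ruby
names = ['Pack', 'Adams', 'Moore']

# Explicit block that defers to <=>, as in the first Classroom above.
with_block = names.sort { |one, two| one <=> two }

# No block: #sort falls back to <=> on its own.
without_block = names.sort

puts with_block == without_block
```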

We now introduce the spaceship operator on the Student class to compare last names of students.

class Student
  def initialize(first_name, last_name)
    @first_name, @last_name = first_name, last_name
  end

  def name
    first_name + ' ' + last_name
  end

  def <=>(other) # ← new
    last_name <=> other.last_name
  end

  private

  attr_reader :first_name, :last_name
end

This code still won't run. The implicit call to last_name works, but the explicit call to other.last_name is attempting to call a private method on the other student object. Only now can protected methods save our metaphorical bacon.
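If you want to see the failure for yourself, the sketch below reproduces it in isolation. PrivateStudent is a trimmed-down, hypothetical stand-in for the Student class above, keeping only what the comparison needs.

```ruby
class PrivateStudent
  def initialize(last_name)
    @last_name = last_name
  end

  def <=>(other)
    # The implicit receiver is fine; the explicit receiver is not.
    last_name <=> other.last_name
  end

  private

  attr_reader :last_name
end

begin
  PrivateStudent.new('Pack') <=> PrivateStudent.new('Adams')
rescue NoMethodError => error
  # Ruby refuses to call a private method on an explicit receiver.
  puts error.class
end
```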

Let's update the Student class to make #last_name protected. This will allow our spaceship method to call other.last_name, because the other object is also a Student.

class Student
  def initialize(first_name, last_name)
    @first_name, @last_name = first_name, last_name
  end

  def name
    first_name + ' ' + last_name
  end

  def <=>(other)
    last_name <=> other.last_name
  end

  protected

  attr_reader :last_name

  private

  attr_reader :first_name
end

Noooow this code works.
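Putting the pieces together, here is a condensed sketch of the final Student alongside the Classroom. It assumes Classroom reads its @students instance variable directly, and the sample students are made up.

```ruby
class Student
  def initialize(first_name, last_name)
    @first_name, @last_name = first_name, last_name
  end

  def name
    first_name + ' ' + last_name
  end

  def <=>(other)
    last_name <=> other.last_name
  end

  protected

  attr_reader :last_name

  private

  attr_reader :first_name
end

class Classroom
  def initialize(students)
    @students = students
  end

  def alphabetized_students
    @students.sort
  end
end

classroom = Classroom.new([
  Student.new('Mike', 'Pack'),
  Student.new('Ada', 'Lovelace')
])

puts classroom.alphabetized_students.map(&:name)

# Outside of a Student, #last_name is still off limits.
begin
  Student.new('Mike', 'Pack').last_name
rescue NoMethodError
  puts 'last_name is not part of the public API'
end
```

The comparison works between students, while callers such as Classroom still only see #name.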

So this is why I say we shouldn't "use" protected methods as a general purpose tool. It's strictly a refactoring clarification for cases where we'd like to provide some utility without exposing additional API to the outside world. In our case, we'd like to compare two students without exposing the #last_name method publicly.

Phew, now we have a fighting chance of passing this semester.

Posted by Mike Pack on 05/27/2015 at 02:22PM

Tags: ruby, methods, refactoring


Component-Based Acceptance Testing

Have you heard of page objects? They're awesome. I'll refer to them as POs. They were conceived as a set of guidelines for organizing the actions a user takes within an application, and they work quite well. There are a few shortcomings with POs, however: namely, the guidelines (or lack thereof) around how to handle pieces of the app that are shared across pages. That's where components are useful.

A component is a piece of a page; a full page is comprised of zero or more components. Alongside components, a page can also have unique segments that do not fit well into a component.

On the modern web, components are more than a visual abstraction. Web components are increasing in usage as frameworks like Angular, Ember and React advocate their adoption to properly encapsulate HTML, CSS and JavaScript. If you're already organizing your front-end code into components, this article will feel like a natural fit. Uncoincidentally, the behavioral encapsulation of components within acceptance tests is often the same behavioral encapsulation of components in the front-end code. But I'm getting a little ahead of myself...

Let's quickly recap POs. POs date back to 2004, when they were originally called WindowDrivers. Selenium WebDriver popularized the technique under the name Page Objects. Martin Fowler wrote about his latest approach to POs in 2013. There's even some interesting academic research on the impacts of POs. Generally speaking, a single PO represents a single page being tested. It knows the details of interacting with that page, for example, how to find an element to click.

Acceptance tests have two primary categories of events: actions and assertions. Actions are the interactions with the browser. Assertions are checks that the browser is in the correct state. The community prefers that POs perform actions on the page, yet do not make assertions. Assertions should reside in the test itself.

To demonstrate POs and components, let's write some acceptance tests around a couple of basic interactions with Twitter's profile page, pictured below.

[Image: Twitter profile page]

Clicking the blue feather icon on the top right opens a dialog that allows the user to compose a tweet.

[Image: Twitter compose dialog]

For this demonstration, we'll use Ruby, RSpec and Capybara to mimic these interactions in our acceptance tests, but the rules we'll discuss here can be readily translated to other toolsets.

We might start with a PO that looks like the following. This simple PO knows how to visit a profile page, navigate to a user's followers, and begin composing a tweet.

module Page
  class Profile
    include Capybara::DSL

    def navigate(handle)
      visit "/#{handle}"
    end

    def navigate_to_followers
      click_link 'Followers'
    end

    def open_tweetbox
      click_button 'Tweet'
    end
  end
end

The following test uses each part of the above PO.

describe 'the profile page' do
  let(:profile_page) { Page::Profile.new }
  
  before do
    profile_page.navigate('mikepack_')
  end
  
  it 'allows me to navigate to the followers page' do
    profile_page.navigate_to_followers
  
    expect(current_path).to eq('/mikepack_/followers')
  end
  
  it 'allows me to write a new tweet' do
    profile_page.open_tweetbox
  
    expect(page).to have_content('Compose new Tweet')
  end
end

That's pretty much all a PO does. For me, there are a few outstanding questions at this point, but we've largely showcased the pattern. To highlight where POs start breaking down, let's model the "followers" page using a PO.

module Page
  class Followers
    include Capybara::DSL

    def navigate(handle)
      visit "/#{handle}/followers"
    end
  
    def navigate_to_tweets
      click_link 'Tweets'
    end
  
    # Duplicated from Page::Profile
    def open_tweetbox
      click_button 'Tweet'
    end
  end
end

Uh oh, we've encountered our first problem: a user can create a tweet from both the main profile page and from the followers page. We need to share the #open_tweetbox action between these two pages. The conventional wisdom here is to create another "tweetbox page", like the following. We'll move the #open_tweetbox method into the new PO and out of the other POs, and rename it to #open.

module Page
  class Tweetbox
    include Capybara::DSL
  
    def open
      click_button 'Tweet'
    end
  end
end

Our test for the profile page now incorporates the new Tweetbox PO and our code is a whole lot more DRY.

describe 'the profile page' do
  let(:profile_page) { Page::Profile.new }
  let(:tweetbox_page) { Page::Tweetbox.new } # New code
  
  before do
    # Original setup remains the same
  end
  
  it 'allows me to navigate to the followers page' do
    # Original test remains the same
  end
  
  it 'allows me to write a new tweet' do
    tweetbox_page.open
  
    expect(page).to have_content('Compose new Tweet')
  end
end

We're now up against another conundrum: if both the tweets page and the followers page have the ability to compose a new tweet, do we duplicate the test for composing a tweet in both pages? Do we put it in one page and not the other? How do we choose which page?

This is where components enter the scene. In fact, we almost have a component already: Page::Tweetbox. I dislike the conventional wisdom to make any portion of a page another PO, like we did with Page::Tweetbox. In my opinion, POs should represent full pages. I believe that whole pages and portions of pages (i.e. components) carry significantly different semantics. We should treat POs and components differently, even though their implementations are mostly consistent. Let's talk about the differences.

Here are my guidelines for page and component objects:

  1. If it's shared across pages, it's a component.
  2. Pages have URLs, components don't.
  3. Pages have assertions, components don't.

Let's address these individually.

If it's shared across pages, it's a component.

Let's refactor the Page::Tweetbox object into a component. The following snippet simply changes the name from Page::Tweetbox to Component::Tweetbox. It doesn't answer a majority of our questions, but it's a necessary starting place.

module Component
  class Tweetbox
    include Capybara::DSL
 
    def open
      click_button 'Tweet'
    end
  end
end

In the tests, instead of using the sub-page object, Page::Tweetbox, we would now instantiate the Component::Tweetbox component.

Pages have URLs, components don't.

This is an important distinction as it allows us to build better tools around pages. If we have a base Page class, we can begin to support the notion of a URL. Below we'll add a simple DSL for declaring a page's URL, a reusable #navigate method, and the ability to assert that a page is the current page.

class Page
  # Our mini DSL for declaring a URL
  def self.url(url)
    @url = url
  end
 
  # We're supporting both static and dynamic URLs, so assume
  # it's a dynamic URL if the PO is instantiated with an arg
  def initialize(*args)
    if args.count > 0
      # We're initializing the page for a specific object
      @url = self.class.instance_variable_get(:@url).(*args)
    end
  end
 
  # Our reusable navigate method for all pages
  def navigate(*args)
    page.visit url(*args)
  end
 
  # A predicate we can use to check if a PO is the current page.
  # It returns true or false so RSpec's be_the_current_page can use it.
  def the_current_page?
    current_path == url
  end
 
  private
 
  # Helper method for calculating the URL
  def url(*args)
    return @url if @url
 
    url = self.class.instance_variable_get(:@url)
    url.respond_to?(:call) ? url.(*args) : url
  end
 
  include Capybara::DSL
end

Our profile and followers POs can now use the base class we just defined. Let's update them. Below, we use the mini DSL for declaring a URL at the top. This DSL supports passing lambdas to accommodate a PO that has a dynamic URL. We can remove the #navigate method from both POs, and use the one in the Page base class.

The profile page, refactored to use the Page base class.

class Page::Profile < Page
  url lambda { |handle| "/#{handle}" }
 
  def navigate_to_followers
    click_link 'Followers'
  end
end

The followers page, refactored to use the Page base class.

class Page::Followers < Page
  url lambda { |handle| "/#{handle}/followers" }
 
  def navigate_to_tweets
    click_link 'Tweets'
  end
end

Below, the test now uses the updated PO APIs. I'm excluding the component test for creating a new tweet, but I'll begin addressing it shortly.

describe 'the profile page' do
  let(:profile_page) { Page::Profile.new }
 
  before do
    profile_page.navigate('mikepack_')
  end
 
  it 'allows me to navigate to the followers page' do
    profile_page.navigate_to_followers
 
    expect(Page::Followers.new('mikepack_')).to be_the_current_page
  end
end

There are a few things happening in the above test. First, we are not hardcoding URLs in the tests themselves. In the initial example, the URL of the profile page and the URL of the followers page were hardcoded and therefore not reusable across tests. By putting the URL in the PO, we can encapsulate the URL.

Second, we're using the URL within a profile_page PO to navigate to the user's profile page. In our test setup, we tell the browser to navigate to a URL, but we only specify a handle. Since our Page base class supports lambdas to generate URLs, we can dynamically create a URL based off the handle.

Third, we assert that the followers page is the current page, using a little RSpec magic. When making the assertion #be_the_current_page, RSpec will call the method #the_current_page? on whatever object the assertion is being made on. In this case, it's a new instance of Page::Followers. #the_current_page? is expected to return true or false, and our version of it uses the URL specified in the PO to check against the current browser's URL. Below is the relevant method from the Page base class that fulfills this assertion, written as a plain comparison so it returns a boolean.

def the_current_page?
  current_path == url
end

This is how we can provide better URL support for POs. Naturally, portions of a page do not have URLs, so components do not have URLs. (If you're being pedantic, a portion of a page can be linked with a fragment identifier, but these almost always link to copy within the page, not specific functionality.)
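To see the URL resolution on its own, without a browser, here is a stripped-down sketch. RoutedPage, StaticAbout and DynamicProfile are hypothetical names, and the sketch deliberately leaves out Capybara. It only shows how a class-level URL, whether a string or a lambda, could be resolved.

```ruby
class RoutedPage
  # Stores the URL (a string or a lambda) on the subclass itself.
  def self.url(url)
    @url = url
  end

  def self.declared_url
    @url
  end

  def initialize(*args)
    @args = args
  end

  # Calls the lambda with the constructor arguments, or returns the string.
  def url
    declared = self.class.declared_url
    declared.respond_to?(:call) ? declared.call(*@args) : declared
  end
end

class StaticAbout < RoutedPage
  url '/about'
end

class DynamicProfile < RoutedPage
  url lambda { |handle| "/#{handle}" }
end

puts StaticAbout.new.url
puts DynamicProfile.new('mikepack_').url
```

Because the URL is stored in a class-level instance variable, each subclass keeps its own declaration and none of them leak into one another.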

Pages have assertions, components don't.

The conventional wisdom suggests that POs should not make assertions on the page. They should be used exclusively for performing actions. Having built large systems around POs, I have found no evidence that this is a worthwhile rule. Subjectively, I've noticed an increase in the expressivity of tests which make assertions on POs. Objectively, and more importantly, making assertions on POs lets us reuse aspects of a PO, like DOM selectors, between actions and assertions. Reusing code between actions and assertions is essential to keeping the test suite DRY and loosely coupled. Without making assertions, knowledge about a page is not well-encapsulated within a PO and is strewn throughout the test suite.

But there is one aspect of assertion-free objects that I do embrace, and this brings us back around to addressing how we manage components.

Components should not make assertions. Component objects must exist so that we can fully test our application, but the desire to make assertions on them should lead us down a different path. The following is an acceptable use of components, as we use it to perform actions exclusively. Here, we assume three methods exist on the tweetbox component that allow us to publish a tweet.

describe 'the profile page' do
  let(:profile_page) { Page::Profile.new }
  let(:tweetbox) { Component::Tweetbox.new }
 
  before do
    profile_page.navigate('mikepack_')
  end
 
  it 'shows a tweet immediately after publishing' do
    # These three actions could be wrapped up into one helper action
    # eg #publish_tweet(content)
    tweetbox.open
    tweetbox.write('What a nice day!')
    tweetbox.submit
 
    expect(profile_page).to have_tweet('What a nice day!')
  end
end

In the above example, we use the tweetbox component to perform actions on the page and the profile PO to make assertions about the page. We've introduced a #have_tweet assertion that should know in which part of the page to find tweets and scope the assertion to that DOM selector.
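One way #have_tweet could be backed is a #has_tweet? predicate on the profile PO, since RSpec maps have_tweet onto has_tweet?. The sketch below is an assumption on my part rather than code from a real suite: the '.tweet' selector is invented, and the PO receives its session as an argument so that a small fake can stand in for the browser. In a real suite, Capybara's page would be the object answering has_css?.

```ruby
class ProfileSketch
  TWEET_SELECTOR = '.tweet' # hypothetical selector

  def initialize(session)
    @session = session
  end

  # Scopes the check to the tweet selector instead of the whole page.
  def has_tweet?(content)
    @session.has_css?(TWEET_SELECTOR, text: content)
  end
end

# Stand-in for a browser session; it only knows about the tweets it was given.
class FakeSession
  def initialize(tweets)
    @tweets = tweets
  end

  def has_css?(selector, text:)
    selector == ProfileSketch::TWEET_SELECTOR &&
      @tweets.any? { |tweet| tweet.include?(text) }
  end
end

profile_page = ProfileSketch.new(FakeSession.new(['What a nice day!']))
puts profile_page.has_tweet?('What a nice day!')
```

Keeping the selector in one constant means any action on the same PO that needs to find a tweet can reuse it.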

Now, to showcase how not to use components, we just need to revisit our very first test. This test makes assertions about the contents of the tweetbox component. I've copied it below for ease of reference.

describe 'the profile page' do
  let(:profile_page) { Page::Profile.new }
 
  before do
    profile_page.navigate('mikepack_')
  end
 
  it 'allows me to write a new tweet' do
    profile_page.open_tweetbox
 
    expect(page).to have_content('Compose new Tweet')
  end
end

After converting this test to use the tweetbox component, it would look like the following.

describe 'the profile page' do
  let(:profile_page) { Page::Profile.new }
  let(:tweetbox) { Component::Tweetbox.new }
 
  before do
    profile_page.navigate('mikepack_')
  end
 
  it 'allows me to write a new tweet' do
    tweetbox.open
 
    expect(tweetbox).to have_content('Compose new Tweet')
  end
end

Not good. We're making an assertion on the tweetbox component.

Why not make assertions on components? Practically, there's nothing stopping you, but you'll still have to answer the question: "of all the pages that use this component, which page should I make the assertions on?" If you choose one page over another, gaps in test coverage will persist. If you choose all pages that contain that component, the suite will be unnecessarily slow.

The inclination to make assertions on components stems from the dynamic nature of those components. In the case of the tweetbox component, pressing the "new tweet" button enacts the dynamic behavior of the component. Pressing this button shows a modal and a form for composing a tweet. The dynamic behavior of a component is realized with JavaScript, and should therefore be tested with JavaScript. By testing with JavaScript, there is a single testing entryway with the component and we'll more rigorously cover the component's edge cases.

Below is an equivalent JavaScript test for asserting the same behavior as the test above. You could use Teaspoon as an easy way to integrate JavaScript tests into your Rails environment. I'm also using the Mocha test framework, with the Chai assertion library.

describe('Twitter.Tweetbox', function() {
  fixture.load('tweetbox.html');

  beforeEach(function() {
    new Twitter.Tweetbox();
  });

  it('allows me to write a new tweet when opening the tweetbox', function() {
    $('button:contains("Tweet")').click();

    expect($('.modal-title').text()).to.equal('Compose new Tweet');
  });
});

By testing within JavaScript, we now have a clear point for making assertions. There is no more confusion about where a component should be tested. We continue to use components alongside POs to perform actions in our acceptance suite, but we do not make assertions on them. These tests will run significantly faster than anything we attempt in Capybara, and we're moving the testing logic closer to the code under test.

Wrapping up

Unsurprisingly, if you're using web components or following a component-based structure within your HTML and CSS, component-based acceptance testing is a natural fit. You'll find that components in your tests map closely to components in your markup. This creates more consistency and predictability when maintaining the test suite and forges a shared lexicon between engineering teams.

Your mileage may vary, but I've found this page and component structure to ease the organizational decisions necessary in every acceptance suite. Using the three simple guidelines discussed in this article, your team can make significant strides towards a higher quality suite. Happy testing! 

Posted by Mike Pack on 04/27/2015 at 08:53AM

Tags: components, page objects, capybara, rspec, testing, acceptance