Refactoring from nested loops to enumerators and lambdas


Being an Old School Programmer, I naturally tend to write ever more deeply nested loops.

I hate myself when I do this, because the result is hard to test (especially if some of the loops have nasty external side effects), hard to reuse, and hard to refactor.

So I’m trying a new pattern…

Walk with me, this is going to be long… The example is a teaching example / dojo exercise for myself, so excuse me if it is slightly contrived.

Here is a typical chunk of my code…

def nested(a, b, c)
  stuff_a = func_a(a)
  stuff_b = func_b(b)
  stuff_c = func_c(c)
  result = {}
  func_1(stuff_a) do |a1|
    stuff_d = func_d(a1 + stuff_b)
    func_2(stuff_d) do |a2|
      stuff_e = func_e(a2 + stuff_c)
      func_3(stuff_e) do |a3|
        result[a3] = func_f(a3)
      end
    end
  end
  result
end

It’s fairly clear but has a few gotchas.

  • It’s unclear which part of the code actually depends on the parameters.
  • In this toy example, the function is small… but a Real Life nested loop function like this can quickly grow hideously large.
  • As a “premature optimization” I have factored out subexpressions that do not alter within the loops, resulting in large scopes for variables that are only used inside the loops.
  • func_1(), func_2(), func_3() yield a stream of things… but a stream of things should just be an Enumerable!
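To make that last bullet concrete, here is a minimal sketch. The body of func_1 is invented for illustration (the post never shows it); the only real API used is Object#enum_for, which wraps an existing block-yielding method in an Enumerator without editing the method.

```ruby
# Hypothetical block-style generator in the shape of func_1.
# The body (three consecutive integers) is an assumption.
def func_1(n)
  yield n
  yield n + 1
  yield n + 2
end

# Block style: collecting anything needs a manual accumulator.
collected = []
func_1(1) { |x| collected << x * 10 }

# enum_for wraps the same method in an Enumerator, so the whole
# Enumerable toolbox (map, select, first, ...) becomes available.
stream = enum_for(:func_1, 1)
mapped = stream.map { |x| x * 10 }
evens  = stream.select(&:even?)

p collected # [10, 20, 30]
p mapped    # [10, 20, 30]
p evens     # [2]
```

The block form and the Enumerator form produce the same values; the difference is that the Enumerator can be passed around and composed.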

OK, so try 2… reduce the scope of the stuff_* variables, which is a pessimization: func_b and func_c are now re-evaluated inside the loops.

def nested(a, b, c)
  result = {}
  stuff_a = func_a(a)
  func_1(stuff_a) do |a1|
    stuff_b = func_b(b)
    stuff_d = func_d(a1 + stuff_b)
    func_2(stuff_d) do |a2|
      stuff_c = func_c(c)
      stuff_e = func_e(a2 + stuff_c)
      func_3(stuff_e) do |a3|
        result[a3] = func_f(a3)
      end
    end
  end
  result
end

I can extract the inner loop as a function, but my parameter list balloons…

def inner_2(a2, c, result)
  stuff_c = func_c(c)
  stuff_e = func_e(a2 + stuff_c)
  func_3(stuff_e) do |a3|
    result[a3] = func_f(a3)
  end
end

def nested(a, b, c)
  result = {}
  stuff_a = func_a(a)
  func_1(stuff_a) do |a1|
    stuff_b = func_b(b)
    stuff_d = func_d(a1 + stuff_b)
    func_2(stuff_d) do |a2|
      inner_2(a2, c, result)
    end
  end
  result
end

and I still re-evaluate func_c on every iteration!
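To see the cost, here is a rough sketch that counts how often func_c runs when the extracted inner_2 method is called from inside the loops. Everything about the stand-ins is an assumption: the func_* bodies, the global counter, and the plain array used in place of func_1 are made up for the experiment.

```ruby
$func_c_calls = 0 # global counter, only here to observe the cost

# Hypothetical stand-ins; the real bodies are not shown in the post.
def func_c(c)
  $func_c_calls += 1
  c * 3
end

def func_2(n) # yields three values
  yield n
  yield n + 1
  yield n + 2
end

def func_3(n) # yields a single value
  yield n
end

def inner_2(a2, c, result)
  stuff_c = func_c(c) # re-evaluated on every call
  func_3(a2 + stuff_c) { |a3| result[a3] = a3 * a3 }
end

result = {}
[1, 2].each do |a1| # a plain array stands in for func_1
  func_2(a1) { |a2| inner_2(a2, 0, result) }
end

puts $func_c_calls # 2 outer x 3 inner iterations = 6
```

The loop-invariant func_c(c) is computed six times here, although its argument never changes.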

If I use a closure instead, my parameter list collapses again…

def nested(a, b, c)
  result = {}
  stuff_a = func_a(a)
  func_1(stuff_a) do |a1|
    stuff_b = func_b(b)
    stuff_d = func_d(a1 + stuff_b)
    inner_2 = ->(a2) {
      stuff_c = func_c(c)
      stuff_e = func_e(a2 + stuff_c)
      func_3(stuff_e) do |a3|
        result[a3] = func_f(a3)
      end
    }
    func_2(stuff_d) do |a2|
      inner_2.call(a2)
    end
  end
  result
end

But I still have a pessimization. If only I could pass an enumerator around… So let’s try converting func_2 to an enumerator…

def func_2(j)
  return to_enum(__method__, j) unless block_given?
  # ...lots of code and a...
  yield a2
  # ...lots more code
end

def nested(a, b, c)
  result = {}
  stuff_a = func_a(a)
  func_1(stuff_a) do |a1|
    stuff_b = func_b(b)
    stuff_d = func_d(a1 + stuff_b)
    inner_2 = ->(a2) {
      stuff_c = func_c(c)
      stuff_e = func_e(a2 + stuff_c)
      func_3(stuff_e) do |a3|
        result[a3] = func_f(a3)
      end
    }
    func_2(stuff_d).each do |a2|
      inner_2.call(a2)
    end
  end
  result
end
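The to_enum guard on the first line of func_2 is what lets the method serve both callers. A small sketch of the idiom follows; the body of func_2 here (three consecutive integers) is invented purely for illustration, since the real body is elided above.

```ruby
# Hypothetical func_2: the guard line is the idiom from the post,
# the yielded values are an assumption.
def func_2(j)
  return to_enum(__method__, j) unless block_given?
  yield j
  yield j + 1
  yield j + 2
end

# Called with a block, it behaves exactly as before.
from_block = []
func_2(5) { |a2| from_block << a2 }

# Called without a block, it returns an Enumerator over the same values.
enum      = func_2(5)
doubled   = enum.map { |a2| a2 * 2 }
first_two = enum.first(2)

p from_block # [5, 6, 7]
p enum.class # Enumerator
p doubled    # [10, 12, 14]
p first_two  # [5, 6]
```

Existing block-style callers keep working unchanged, which is why the conversion can be done one method at a time.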

Then pass the enumerator in, and we can stop the silly re-evaluation of func_c on every iteration of the func_2 loop…

def nested(a, b, c)
  result = {}
  stuff_a = func_a(a)
  func_1(stuff_a) do |a1|
    stuff_b = func_b(b)
    stuff_d = func_d(a1 + stuff_b)
    inner_2 = ->(e) {
      stuff_c = func_c(c)
      e.each do |a2|
        stuff_e = func_e(a2 + stuff_c)
        func_3(stuff_e) do |a3|
          result[a3] = func_f(a3)
        end
      end
    }
    inner_2.call(func_2(stuff_d))
  end
  result
end

And I can keep going with func_1…

def nested(a, b, c)
  result = {}
  stuff_a = func_a(a)
  inner_1 = ->(e1, inner_2) {
    stuff_b = func_b(b)
    e1.each do |a1|
      stuff_d = func_d(a1 + stuff_b)
      inner_2.call(func_2(stuff_d))
    end
  }
  inner_2 = ->(e2) {
    stuff_c = func_c(c)
    e2.each do |a2|
      stuff_e = func_e(a2 + stuff_c)
      func_3(stuff_e) do |a3|
        result[a3] = func_f(a3)
      end
    end
  }
  inner_1.call(func_1(stuff_a), inner_2)
  result
end

I can reduce the scope of result and move it to the outermost level…

def nested(a, b, c)
  stuff_a = func_a(a)
  inner_1 = ->(e1, inner_2, &block) {
    stuff_b = func_b(b)
    e1.each do |a1|
      stuff_d = func_d(a1 + stuff_b)
      inner_2.call(func_2(stuff_d), &block)
    end
  }
  inner_2 = ->(e2, &block) {
    stuff_c = func_c(c)
    e2.each do |a2|
      stuff_e = func_e(a2 + stuff_c)
      func_3(stuff_e, &block)
    end
  }
  result = {}
  inner_1.call(func_1(stuff_a), inner_2) do |a3|
    result[a3] = func_f(a3)
  end
  result
end
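This step leans on two Ruby details: a lambda may declare a &block parameter, and Proc#call hands any block it is given through to the lambda. A stripped-down sketch of just that plumbing, with the helper name collect_through_lambdas and the doubling logic invented for illustration:

```ruby
# Hypothetical helper: shows a caller's block travelling through
# two lambdas, the way the block travels through inner_1 and inner_2.
def collect_through_lambdas(items)
  # inner receives an enumerable plus whatever block was forwarded to it.
  inner = ->(e, &block) { e.each { |x| block.call(x * 2) } }

  # outer does no work of its own; it forwards its block to inner.
  outer = ->(e, inner_fn, &block) { inner_fn.call(e, &block) }

  seen = []
  outer.call(items, inner) { |x| seen << x } # block attaches to Proc#call
  seen
end

p collect_through_lambdas([1, 2, 3]) # [2, 4, 6]
```

Because the accumulator lives only in the caller's block, neither lambda needs to know what is being collected.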

I can convert the inner_1 lambda to a vanilla method and convert that to an Enumerator…

def inner_1(b, e1, inner_2, &block)
  return to_enum(__method__, b, e1, inner_2) unless block_given?
  stuff_b = func_b(b)
  e1.each do |a1|
    stuff_d = func_d(a1 + stuff_b)
    inner_2.call(func_2(stuff_d), &block)
  end
end

def nested(a, b, c)
  stuff_a = func_a(a)
  inner_2 = ->(e2, &block) {
    stuff_c = func_c(c)
    e2.each do |a2|
      stuff_e = func_e(a2 + stuff_c)
      func_3(stuff_e, &block)
    end
  }
  result = {}
  inner_1(b, func_1(stuff_a), inner_2).each do |a3|
    result[a3] = func_f(a3)
  end
  result
end

Since inner_1 now returns just a vanilla Enumerator, I can use each_with_object.
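As a quick reminder of what each_with_object buys, here is a small sketch. The method numbers_upto is a made-up stand-in for any method that returns an Enumerator; only Enumerable#each_with_object itself is real API.

```ruby
# Hypothetical enumerator-returning method, standing in for inner_1.
def numbers_upto(n)
  return to_enum(__method__, n) unless block_given?
  1.upto(n) { |i| yield i }
end

# Manual version: declare the hash, fill it, then return it.
manual = {}
numbers_upto(3).each { |a3| manual[a3] = a3 * a3 }

# each_with_object threads the hash through the block and returns it,
# so the separate declaration and the trailing return value disappear.
compact = numbers_upto(3).each_with_object({}) { |a3, result| result[a3] = a3 * a3 }

p manual  # {1=>1, 2=>4, 3=>9}
p compact # {1=>1, 2=>4, 3=>9}
```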

I can also extract inner_2 as a method that returns a lambda…

def inner_1(b, e1, inner_2, &block)
  return to_enum(__method__, b, e1, inner_2) unless block_given?
  stuff_b = func_b(b)
  e1.each do |a1|
    stuff_d = func_d(a1 + stuff_b)
    inner_2.call(func_2(stuff_d), &block)
  end
end

def inner_2(c)
  stuff_c = func_c(c)
  ->(e2, &block) {
    e2.each do |a2|
      stuff_e = func_e(a2 + stuff_c)
      func_3(stuff_e, &block)
    end
  }
end

def nested(a, b, c)
  stuff_a = func_a(a)
  inner_1(b, func_1(stuff_a), inner_2(c)).each_with_object({}) do |a3, result|
    result[a3] = func_f(a3)
  end
end

Note that func_c is once again evaluated only once. It’s testable, and it’s reusable in the vanilla “everything is just Enumerable” sense.
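To check both claims, here is a rough harness around the final listing. Every func_* body below is a toy stand-in invented for the experiment (the post never defines them), as is the global call counter; inner_1, inner_2 and nested are copied from the final version.

```ruby
$func_c_calls = 0 # global counter, only here to observe evaluation count

# Toy stand-ins (assumptions, not from the post).
def func_a(a); a + 1; end
def func_b(b); b * 2; end
def func_c(c); $func_c_calls += 1; c * 3; end
def func_d(x); x; end
def func_e(x); x; end
def func_f(x); x * x; end

# Each generator yields n and n + 1, or returns an Enumerator.
def func_1(n)
  return to_enum(__method__, n) unless block_given?
  yield n
  yield n + 1
end

def func_2(n)
  return to_enum(__method__, n) unless block_given?
  yield n
  yield n + 1
end

def func_3(n)
  return to_enum(__method__, n) unless block_given?
  yield n
  yield n + 1
end

def inner_1(b, e1, inner_2, &block)
  return to_enum(__method__, b, e1, inner_2) unless block_given?
  stuff_b = func_b(b)
  e1.each do |a1|
    stuff_d = func_d(a1 + stuff_b)
    inner_2.call(func_2(stuff_d), &block)
  end
end

def inner_2(c)
  stuff_c = func_c(c) # evaluated once, when the lambda is built
  ->(e2, &block) {
    e2.each do |a2|
      stuff_e = func_e(a2 + stuff_c)
      func_3(stuff_e, &block)
    end
  }
end

def nested(a, b, c)
  stuff_a = func_a(a)
  inner_1(b, func_1(stuff_a), inner_2(c)).each_with_object({}) do |a3, result|
    result[a3] = func_f(a3)
  end
end

p nested(0, 0, 0) # {1=>1, 2=>4, 3=>9, 4=>16}
p $func_c_calls   # 1

# Testing inner_1 alone: a plain array replaces func_1, and a
# pass-through lambda replaces the real inner_2.
pass_through = ->(e2, &block) { e2.each(&block) }
p inner_1(0, [10], pass_through).to_a # [10, 11]
```

The last line is the testability payoff: inner_1 can be exercised with an ordinary array and a trivial lambda, with no need to set up the outer or inner loops.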