<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Bcomposes</title>
	<atom:link href="https://bcomposes.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>https://bcomposes.wordpress.com</link>
	<description>Jason Baldridge&#039;s blog: computational linguistics, programming (Scala, Java, Python, R), and random acts of skepticism</description>
	<lastBuildDate>Thu, 23 Feb 2012 06:11:12 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='bcomposes.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>https://secure.gravatar.com/blavatar/36151e3a580f62a4de82c871a4b05f56?s=96&#038;d=https%3A%2F%2Fs-ssl.wordpress.com%2Fi%2Fbuttonw-com.png</url>
		<title>Bcomposes</title>
		<link>https://bcomposes.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="https://bcomposes.wordpress.com/osd.xml" title="Bcomposes" />
	<atom:link rel='hub' href='https://bcomposes.wordpress.com/?pushpress=hub'/>
		<item>
		<title>Student Questions about Scala, Part 1</title>
		<link>https://bcomposes.wordpress.com/2012/02/23/student-questions-about-scala-part-1/</link>
		<comments>https://bcomposes.wordpress.com/2012/02/23/student-questions-about-scala-part-1/#comments</comments>
		<pubDate>Thu, 23 Feb 2012 05:58:19 +0000</pubDate>
		<dc:creator>jasonbaldridge</dc:creator>
				<category><![CDATA[programming]]></category>
		<category><![CDATA[scala]]></category>
		<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://bcomposes.wordpress.com/?p=365</guid>
		<description><![CDATA[Topics: conventions, regexes, mapping, partitioning, vectors vs lists, overloaded constructors, case classes, traits, multiple inheritance, implicits Preface I&#8217;m currently teaching a course on Applied Text Analysis and am using Scala as the programming language taught and used in the course. Rather than creating more tutorials, I figured I&#8217;d take a page from Brian Dunning&#8217;s playbook &#8230;<p><a href="https://bcomposes.wordpress.com/2012/02/23/student-questions-about-scala-part-1/" class="more-link">Read More</a></p>]]></description>
			<content:encoded><![CDATA[<p><strong>Topics</strong>: <em>conventions, regexes, mapping, partitioning, vectors vs lists, overloaded constructors, case classes, traits, multiple inheritance, implicits</em></p>
<h2>Preface</h2>
<p>I&#8217;m currently teaching a course on <a href="http://ata-s12.utcompling.com/">Applied Text Analysis</a> and am using Scala as the programming language taught and used in the course. Rather than creating more tutorials, I figured I&#8217;d take a page from Brian Dunning&#8217;s playbook on his <a href="http://skeptoid.com/">Skeptoid</a> podcast (highly recommended), where he takes student questions. So, I had the students in the course submit their questions about Scala, based on the readings and assignments thus far. This post covers over half of them; the rest will be covered in a follow-up post.</p>
<p>I start with some of the more basic questions, and the questions and/or answers progressively get into more intermediate-level topics. Suggestions and comments to improve any of the answers are very welcome!</p>
<h2>Basic Questions</h2>
<p><em><span style="color:#0000ff;">Q. Concerning addressing parts of variables: to address individual parts of Lists, the items are numbered 0, 1, 2, etc. That is, the first element is called &#8220;0&#8221;. It seems to be the same for Arrays and Maps, but not for Tuples: to get the first element of a Tuple, I need to use Tuple._1. Why is that?</span></em></p>
<p>A. It&#8217;s just a matter of convention &#8212; tuples have used a 1-based index in other languages like Haskell, and it seems that Scala has adopted the same convention/tradition. See:</p>
<p>http://stackoverflow.com/questions/6241464/why-are-the-indexes-of-scala-tuples-1-based</p>
<p><em><span style="color:#0000ff;">Q. It seems that Scala doesn&#8217;t recognize the &#8220;\b&#8221; boundary character in regular expressions. Is there something similar in Scala?</span></em></p>
<p>A. Scala does recognize boundary characters. For example, the following REPL session declares a regex that finds &#8220;the&#8221; with boundaries, and successfully retrieves the three tokens of &#8220;the&#8221; in the example sentence.</p>
<p><pre class="brush: scala;">
scala&gt; val TheRE = &quot;&quot;&quot;\bthe\b&quot;&quot;&quot;.r
TheRE: scala.util.matching.Regex = \bthe\b

scala&gt; val sentence = &quot;She thinks the man is a stick-in-the-mud, but the man disagrees.&quot;
sentence: java.lang.String = She thinks the man is a stick-in-the-mud, but the man disagrees.

scala&gt; TheRE.findAllIn(sentence).toList
res1: List[String] = List(the, the, the)
</pre></p>
<p><em><span style="color:#0000ff;">Q. Why doesn&#8217;t the method &#8220;split&#8221; work on args? Example: val arg = args.split(&quot; &quot;). Args are strings, right? So split should work.</span></em></p>
<p>A. The <strong>args</strong> variable is an Array of Strings, not a String, so split doesn&#8217;t work on it. The command line has, in effect, already been split into its separate arguments.</p>
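<p>As a minimal sketch (the object and method names are made up for illustration, and <strong>args</strong> stands in for the Array passed to a program), one could either split each argument individually or join the arguments back into a single String first:</p>

```scala
// Hypothetical helper names, for illustration only.
object ArgsDemo {
  // Split every individual argument on spaces and collect all the pieces.
  def tokens(args: Array[String]): List[String] =
    args.toList.flatMap(arg => arg.split(" ").toList)

  // Join the arguments into one String, which can then be split as usual.
  def rejoined(args: Array[String]): String = args.mkString(" ")
}
```

<p>Which of the two is appropriate depends on whether the individual arguments can themselves contain spaces (e.g. when they were quoted on the command line).</p>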
<p><em><span style="color:#0000ff;">Q. What is the major difference between <strong>foo.mapValues(x=&gt;x.length)</strong> and <strong>foo.map(x=&gt;x.length)</strong>. Some places one works and one does not.</span></em></p>
<p>A. The <strong>map</strong> function works on all collection types, including Seqs and Maps (note that Maps can be seen as collections of Tuple2s). The <strong>mapValues</strong> function, however, only works on Maps. It is essentially a convenience function. As an example, let&#8217;s start with a simple Map from Ints to Ints.</p>
<p><pre class="brush: scala;">
scala&gt; val foo = List((1,2),(3,4)).toMap
foo: scala.collection.immutable.Map[Int,Int] = Map(1 -&gt; 2, 3 -&gt; 4)
</pre></p>
<p>Now consider the task of adding 2 to each value in the Map. This can be done with the <strong>map</strong> function as follows.</p>
<p><pre class="brush: scala;">
scala&gt; foo.map { case(key,value) =&gt; (key,value+2) }
res5: scala.collection.immutable.Map[Int,Int] = Map(1 -&gt; 4, 3 -&gt; 6)
</pre></p>
<p>So, the map function iterates over key/value pairs. We need to match both of them, and then output the key and the changed value to create the new Map. The <strong>mapValues</strong> function makes this quite a bit easier.</p>
<p><pre class="brush: scala;">
scala&gt; foo.mapValues(2+)
res6: scala.collection.immutable.Map[Int,Int] = Map(1 -&gt; 4, 3 -&gt; 6)
</pre></p>
<p>Returning to the question about computing lengths with <strong>mapValues</strong> or <strong>map</strong>: it is just a question of which values you are transforming, as in the following examples.</p>
<p><pre class="brush: scala;">
scala&gt; val sentence = &quot;here is a sentence with some words&quot;.split(&quot; &quot;).toList
sentence: List[java.lang.String] = List(here, is, a, sentence, with, some, words)

scala&gt; sentence.map(_.length)
res7: List[Int] = List(4, 2, 1, 8, 4, 4, 5)

scala&gt; val firstCharTokens = sentence.groupBy(x=&gt;x(0))
firstCharTokens: scala.collection.immutable.Map[Char,List[java.lang.String]] = Map(s -&gt; List(sentence, some), a -&gt; List(a), i -&gt; List(is), h -&gt; List(here), w -&gt; List(with, words))

scala&gt; firstCharTokens.mapValues(_.length)
res9: scala.collection.immutable.Map[Char,Int] = Map(s -&gt; 2, a -&gt; 1, i -&gt; 1, h -&gt; 1, w -&gt; 2)
</pre></p>
<p><em><span style="color:#0000ff;">Q. Is there any function that splits a list into two lists with the elements in the alternating positions of the original list? For example,</span></em></p>
<p><strong><em><span style="color:#0000ff;">MainList =(1,2,3,4,5,6)</span></em></strong></p>
<p><strong><em><span style="color:#0000ff;">List1 = (1,3,5)</span></em></strong><br />
<strong> <em><span style="color:#0000ff;"> List2 = (2,4,6)</span></em></strong></p>
<p>A. Given the exact main list you provided, one can use the partition function and use the modulo operation to see whether the value is divisible evenly by 2 or not.</p>
<p><pre class="brush: scala;">
scala&gt; val mainList = List(1,2,3,4,5,6)
mainList: List[Int] = List(1, 2, 3, 4, 5, 6)

scala&gt; mainList.partition(_ % 2 == 0)
res0: (List[Int], List[Int]) = (List(2, 4, 6),List(1, 3, 5))
</pre></p>
<p>So, partition returns a pair of Lists. The first has all the elements that match the condition and the second has all the ones that do not.</p>
<p>Of course, this wouldn&#8217;t work in general for Lists that have Strings, or that don&#8217;t have Ints in order, etc. However, the <em>indices</em> of a List are always well-behaved in this way, so we just need to do a bit more work by zipping each element with its index and then partitioning based on indices.</p>
<p><pre class="brush: scala;">
scala&gt; val unordered = List(&quot;b&quot;,&quot;2&quot;,&quot;a&quot;,&quot;4&quot;,&quot;z&quot;,&quot;8&quot;)
unordered: List[java.lang.String] = List(b, 2, a, 4, z, 8)

scala&gt; unordered.zipWithIndex
res1: List[(java.lang.String, Int)] = List((b,0), (2,1), (a,2), (4,3), (z,4), (8,5))

scala&gt; val (evens, odds) = unordered.zipWithIndex.partition(_._2 % 2 == 0)
evens: List[(java.lang.String, Int)] = List((b,0), (a,2), (z,4))
odds: List[(java.lang.String, Int)] = List((2,1), (4,3), (8,5))

scala&gt; evens.map(_._1)
res2: List[java.lang.String] = List(b, a, z)

scala&gt; odds.map(_._1)
res3: List[java.lang.String] = List(2, 4, 8)
</pre></p>
<p>Based on this, you could of course write a function that does this for any arbitrary list.</p>
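<p>One possible version of such a function is sketched below. The names <strong>Alternating</strong> and <strong>alternate</strong> are invented here; the body just packages the zipWithIndex and partition steps from the REPL session above.</p>

```scala
// Hypothetical names; this wraps the steps shown in the REPL session.
object Alternating {
  // Split any List into (elements at even indices, elements at odd indices).
  def alternate[T](items: List[T]): (List[T], List[T]) = {
    val (evens, odds) = items.zipWithIndex.partition { case (_, index) => index % 2 == 0 }
    (evens.map(_._1), odds.map(_._1))
  }
}
```

<p>Because the split is based on positions rather than values, it works the same way for Strings, unordered Ints, or any other element type.</p>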
<p><em><span style="color:#0000ff;">Q. How to convert a List to a Vector and vice-versa?</span></em></p>
<p>A. Use <strong>toIndexedSeq</strong> and <strong>toList</strong>.</p>
<p><pre class="brush: scala;">
scala&gt; val foo = List(1,2,3,4)
foo: List[Int] = List(1, 2, 3, 4)

scala&gt; val bar = foo.toIndexedSeq
bar: scala.collection.immutable.IndexedSeq[Int] = Vector(1, 2, 3, 4)

scala&gt; val baz = bar.toList
baz: List[Int] = List(1, 2, 3, 4)

scala&gt; foo == baz
res0: Boolean = true
</pre></p>
<p><em><span style="color:#0000ff;">Q. The advantage of a vector over a list is the constant time look-up. What is the advantage of using a list over a vector?</span></em></p>
<p>A. A List is slightly faster for operations at the head (front) of the sequence, so if all you are doing is a traversal (accessing each element in order, e.g. when mapping), then Lists are perfectly adequate and may be more efficient. They also have some nice pattern matching behavior for case statements.</p>
<p>However, common wisdom seems to be that you should default to using Vectors. See Daniel Spiewak&#8217;s nice answer on Stack Overflow:</p>
<p>http://stackoverflow.com/questions/6928327/when-should-i-choose-vector-in-scala</p>
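<p>To make the contrast concrete, here is a small sketch with invented names. The first method uses the head/tail pattern matching that Lists support; the second uses positional access, which is where Vectors do well. The recursive sum is for illustration only and is not tail-recursive.</p>

```scala
// Hypothetical names, for illustration only.
object SeqChoice {
  // Lists decompose naturally into head :: tail in pattern matches.
  def total(numbers: List[Int]): Int = numbers match {
    case head :: tail => head + total(tail)
    case Nil => 0
  }

  // Vectors give effectively constant-time access to any position.
  def middle[T](items: Vector[T]): T = items(items.length / 2)
}
```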
<p><em><span style="color:#0000ff;">Q. When splitting strings, as in holmes.split(&#8220;\\s&#8221;): \n and \t just require a single &#8216;\&#8217; to be recognized as special, so why are two &#8216;\&#8217;s required for the whitespace character?</span></em></p>
<p>A. That&#8217;s because \n and \t actually mean something in a String.</p>
<p><pre class="brush: scala;">
scala&gt; println(&quot;Here is a line with a tab\tor\ttwo, followed by\na new line.&quot;)
Here is a line with a tab    or    two, followed by
a new line.

scala&gt; println(&quot;This will break\s.&quot;)
&lt;console&gt;:1: error: invalid escape character
println(&quot;This will break\s.&quot;)
</pre></p>
<p>So, you are supplying a String argument to split, and it uses that to construct a regular expression. Given that \s is not a valid String escape sequence, but is a regex metacharacter, you need to escape the backslash so that it reaches the regex engine intact. You can of course use <strong>split(&quot;&quot;&quot;\s&quot;&quot;&quot;)</strong>, though that isn&#8217;t exactly better in this case.</p>
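<p>The following sketch (with an invented object name and sample text) shows that the doubled backslash in a normal String and the single backslash in a triple-quoted String produce the same split:</p>

```scala
// Hypothetical names and sample text, for illustration only.
object WhitespaceSplit {
  // \t and \n are real String escapes, so this text contains a tab and a newline.
  val text = "a b\tc\nd"

  // In a normal String literal the backslash must be doubled to reach the regex engine.
  val escaped = text.split("\\s").toList

  // In a triple-quoted String the backslash is passed through untouched.
  val raw = text.split("""\s""").toList
}
```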
<p><em><span style="color:#0000ff;">Q. I have long been programming in C++ and Java. Therefore, I put a semicolon at the end of the line unconsciously. It seems that the standard coding style of Scala doesn&#8217;t recommend using semicolons. However, I saw that there are some cases that require semicolons, as you showed last class. Is there any specific reason why the semicolon loses its role in Scala?</span></em></p>
<p>A. The main reason is to improve readability, since the semicolon is rarely needed when writing standard code in editors (as opposed to one-liners in the REPL). However, when you want to do something in a single line, like handling multiple cases, you need the semicolons.</p>
<p><pre class="brush: scala;">
scala&gt; val foo = List(&quot;a&quot;,1,&quot;b&quot;,2)
foo: List[Any] = List(a, 1, b, 2)

scala&gt; foo.map { case(x: String) =&gt; x; case(x: Int) =&gt; x.toString }
res5: List[String] = List(a, 1, b, 2)
</pre></p>
<p>But, in general, it&#8217;s best to just split these cases over multiple lines in any actual code.</p>
<p><em><span style="color:#0000ff;">Q. Is there no way to use _ in map like methods for collections that consist of pairs? For example, <strong>List((1,1),(2,2)).map(e =&gt; e._1 + e._2)</strong> works, but <strong>List((1,1),(2,2)).map(_._1 + _._2)</strong> does not work.</span></em></p>
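<p>A. Each _ stands for a separate parameter, so <strong>_._1 + _._2</strong> is read as a function of two arguments, roughly <strong>(a, b) =&gt; a._1 + b._2</strong>, which is not what <strong>map</strong> expects here. In other words, you only get to use _ once per parameter. It is better anyway to use a case statement that makes it clear what the members of the pairs are.</p>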
<p><pre class="brush: scala;">
scala&gt;  List((1,1),(2,2)).map { case(num1, num2) =&gt; num1+num2 }
res6: List[Int] = List(2, 4)
</pre></p>
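<p>As a rough illustration of how the placeholder behaves (the object and value names are invented), a single _ works where one parameter is expected, while two of them fit functions that take two parameters, such as the one passed to <strong>reduce</strong>:</p>

```scala
// Hypothetical names, for illustration only.
object Placeholders {
  val pairs = List((1, 1), (2, 2))

  // One underscore, one parameter: this is fine with map.
  val firsts = pairs.map(_._1)

  // Two underscores mean two parameters, which suits reduce rather than map.
  val sumOfFirsts = firsts.reduce(_ + _)

  // For pairs, name the parts explicitly with a case.
  val pairSums = pairs.map { case (num1, num2) => num1 + num2 }
}
```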
<p><em><span style="color:#0000ff;">Q. I am unsure about the exact meaning of and the difference between &#8220;=&gt;&#8221; and &#8220;-&gt;&#8221;. They both seem to mean something like &#8220;apply X to Y&#8221; and I see that each is used in a particular context, but what is the logic behind that?</span></em></p>
<p>A. The use of -&gt; simply constructs a Tuple2, as is pretty clear in the following snippet.</p>
<p><pre class="brush: scala;">
scala&gt; val foo = (1,2)
foo: (Int, Int) = (1,2)

scala&gt; val bar = 1-&gt;2
bar: (Int, Int) = (1,2)

scala&gt; foo == bar
res11: Boolean = true
</pre></p>
<p>Primarily, it is syntactic sugar that provides an intuitive symbol for creating elements of a Map. Compare the following two ways of declaring the same Map.</p>
<p><pre class="brush: scala;">
scala&gt; Map((&quot;a&quot;,1),(&quot;b&quot;,2))
res9: scala.collection.immutable.Map[java.lang.String,Int] = Map(a -&gt; 1, b -&gt; 2)

scala&gt; Map(&quot;a&quot;-&gt;1,&quot;b&quot;-&gt;2)
res10: scala.collection.immutable.Map[java.lang.String,Int] = Map(a -&gt; 1, b -&gt; 2)
</pre></p>
<p>The second seems more readable to me.</p>
<p>The use of =&gt; indicates that you are defining a function. The basic form is <em>ARGUMENTS =&gt; RESULT.</em></p>
<p><pre class="brush: scala;">
scala&gt; val addOne = (x: Int) =&gt; x+1
addOne: Int =&gt; Int = &lt;function1&gt;

scala&gt; addOne(2)
res7: Int = 3

scala&gt; val addTwoNumbers = (num1: Int, num2: Int) =&gt; num1+num2
addTwoNumbers: (Int, Int) =&gt; Int = &lt;function2&gt;

scala&gt; addTwoNumbers(3,5)
res8: Int = 8
</pre></p>
<p>Normally, you use it in defining anonymous functions as arguments to functions like <strong>map</strong>, <strong>filter</strong>, and such.</p>
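<p>A short sketch with invented names that puts both symbols side by side: =&gt; defines the anonymous functions handed to <strong>filter</strong> and <strong>map</strong>, while -&gt; builds the pairs that end up in a Map.</p>

```scala
// Hypothetical names, for illustration only.
object ArrowDemo {
  val numbers = List(1, 2, 3, 4, 5)

  // => introduces an anonymous function passed straight to filter and map.
  val evenSquares = numbers.filter(n => n % 2 == 0).map(n => n * n)

  // -> builds pairs (Tuple2s), and a List of pairs can be turned into a Map.
  val squareTable = numbers.map(n => n -> n * n).toMap
}
```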
<p><em><span style="color:#0000ff;">Q. Is there a more convenient way of expressing vowels as [AEIOUaeiou] and consonants as [BCDFGHJKLMNPQRSTVWXYZbcdfghjklmnpqrstvwxyz] in RegExes?</span></em></p>
<p>A. You can use Strings when defining regexes, so you can have a variable for vowels and one for consonants.</p>
<p><pre class="brush: scala;">
scala&gt; val vowel = &quot;[AEIOUaeiou]&quot;
vowel: java.lang.String = [AEIOUaeiou]

scala&gt; val consonant = &quot;[BCDFGHJKLMNPQRSTVWXYZbcdfghjklmnpqrstvwxyz]&quot;
consonant: java.lang.String = [BCDFGHJKLMNPQRSTVWXYZbcdfghjklmnpqrstvwxyz]

scala&gt; val MyRE = (&quot;(&quot;+vowel+&quot;)(&quot;+consonant+&quot;)(&quot;+vowel+&quot;)&quot;).r
MyRE: scala.util.matching.Regex = ([AEIOUaeiou])([BCDFGHJKLMNPQRSTVWXYZbcdfghjklmnpqrstvwxyz])([AEIOUaeiou])

scala&gt; val MyRE(x,y,z) = &quot;aJE&quot;
x: String = a
y: String = J
z: String = E
</pre></p>
<p><em><span style="color:#0000ff;">Q. The &#8220;\b&#8221; in RegExes marks a boundary, right? So, it also captures the &#8220;-&#8221;. But if I have a single string &#8220;sdnfeorgn&#8221;, it does NOT capture the boundaries of that, is that correct? And if so, why doesn&#8217;t it?</span></em></p>
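<p>A. Because there are no boundaries inside that string! Keep in mind that \b matches a position between a word character and a non-word character (or an edge of the string), not a character, so it never captures the &#8220;-&#8221; itself either.</p>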
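<p>To check where boundaries are actually found, here is a small sketch (the names are illustrative) that lists the positions at which \b matches. For a string made up only of letters, it reports just the start and the end of the string, and nothing in between.</p>

```scala
// Hypothetical names, for illustration only.
object Boundaries {
  val Boundary = """\b""".r

  // Positions (not characters) at which a word boundary is found.
  def positions(text: String): List[Int] =
    Boundary.findAllMatchIn(text).map(_.start).toList
}
```

<p>For example, <strong>Boundaries.positions(&quot;in-the&quot;)</strong> includes the positions on either side of the hyphen, but the matches themselves are empty.</p>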
<h2>Intermediate questions</h2>
<p><em><span style="color:#0000ff;">Q. The flatMap function takes lists of lists and merges them to single list. But in the example:</span></em></p>
<p><pre class="brush: scala;">
scala&gt; (1 to 10).toList.map(x=&gt;squareOddNumber(x))
res16: List[Option[Int]] = List(Some(1), None, Some(9), None, Some(25), None, Some(49), None, Some(81), None)

scala&gt; (1 to 10).toList.flatMap(x=&gt;squareOddNumber(x))
res17: List[Int] = List(1, 9, 25, 49, 81)
</pre></p>
<p><em><span style="color:#0000ff;">Here it is not list of list but just a list. In this case it expects the list to be Option list.</span></em><br />
<em><span style="color:#0000ff;"> I tried running the code with function returning just number or None. It showed error. So is there any way to use flatmap without Option lists and just list. For example, <strong>List(1, None, 9, None, 25)</strong> should be returned as <strong>List(1, 9, 25)</strong>.</span></em></p>
<p>A. No, this won&#8217;t work because <strong>List(1, None, 9, None, 25)</strong> mixes <strong>Options</strong> with <strong>Ints</strong>.</p>
<p><pre class="brush: scala;">
scala&gt; val mixedup = List(1, None, 9, None, 25)
mixedup: List[Any] = List(1, None, 9, None, 25)
</pre></p>
<p>So, you should have your function return an <strong>Option</strong> which means returning <strong>Somes</strong> or <strong>Nones</strong>. Then <strong>flatMap</strong> will work happily.</p>
<p>One way to think of Options is that they are like Lists with zero or one element, as can be noted by the parallels in the following snippet.</p>
<p><pre class="brush: scala;">
scala&gt; val foo = List(List(1),Nil,List(3),List(6),Nil)
foo: List[List[Int]] = List(List(1), List(), List(3), List(6), List())

scala&gt; foo.flatten
res12: List[Int] = List(1, 3, 6)

scala&gt; val bar = List(Option(1),None,Option(3),Option(6),None)
bar: List[Option[Int]] = List(Some(1), None, Some(3), Some(6), None)

scala&gt; bar.flatten
res13: List[Int] = List(1, 3, 6)
</pre></p>
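<p>The question uses <strong>squareOddNumber</strong> without showing its definition, so the version below is an assumption about what it might look like: it returns Some for odd inputs and None otherwise, which is all <strong>flatMap</strong> needs.</p>

```scala
object OddSquares {
  // Assumed definition: the question refers to squareOddNumber but does not show it.
  def squareOddNumber(x: Int): Option[Int] =
    if (x % 2 != 0) Some(x * x) else None

  // flatMap drops the Nones and unwraps the Somes.
  val squares = (1 to 10).toList.flatMap(x => squareOddNumber(x))
}
```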
<p><em><span style="color:#0000ff;">Q. Does scala have generic templates (like C++, Java)? eg. in C++, we can use vector&lt;int&gt;, vector&lt;string&gt; etc. Is that possible in scala? If so, how?</span></em></p>
<p>A. Yes, every collection type is parameterized. Notice that each of the following variables is parameterized by the type of the elements they are initialized with.</p>
<p><pre class="brush: scala;">
scala&gt; val foo = List(1,2,3)
foo: List[Int] = List(1, 2, 3)

scala&gt; val bar = List(&quot;a&quot;,&quot;b&quot;,&quot;c&quot;)
bar: List[java.lang.String] = List(a, b, c)

scala&gt; val baz = List(true, false, true)
baz: List[Boolean] = List(true, false, true)
</pre></p>
<p>You can create your own parameterized classes straightforwardly.</p>
<p><pre class="brush: scala;">
scala&gt; class Flexible[T] (val data: T)
defined class Flexible

scala&gt; val foo = new Flexible(1)
foo: Flexible[Int] = Flexible@7cd0570e

scala&gt; val bar = new Flexible(&quot;a&quot;)
bar: Flexible[java.lang.String] = Flexible@31b6956f

scala&gt; val baz = new Flexible(true)
baz: Flexible[Boolean] = Flexible@5b58539f

scala&gt; foo.data
res0: Int = 1

scala&gt; bar.data
res1: java.lang.String = a

scala&gt; baz.data
res2: Boolean = true
</pre></p>
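<p>Methods can take type parameters as well. The sketch below uses invented names and pairs each element of a List with its successor, for whatever element type the List happens to have.</p>

```scala
// Hypothetical names, for illustration only.
object Generics {
  // A method parameterized by the element type T.
  def pairUp[T](items: List[T]): List[(T, T)] = items.zip(items.drop(1))
}
```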
<p><em><span style="color:#0000ff;">Q. How can we easily create, initialize and work with multi-dimensional arrays (and dictionaries)?</span></em></p>
<p>A. Use the <strong>fill</strong> function of the <strong>Array</strong> object to create them.</p>
<p><pre class="brush: scala;">
scala&gt; Array.fill(2)(1.0)
res8: Array[Double] = Array(1.0, 1.0)

scala&gt; Array.fill(2,3)(1.0)
res9: Array[Array[Double]] = Array(Array(1.0, 1.0, 1.0), Array(1.0, 1.0, 1.0))

scala&gt; Array.fill(2,3,2)(1.0)
res10: Array[Array[Array[Double]]] = Array(Array(Array(1.0, 1.0), Array(1.0, 1.0), Array(1.0, 1.0)), Array(Array(1.0, 1.0), Array(1.0, 1.0), Array(1.0, 1.0)))
</pre></p>
<p>Once you have these in hand, you can iterate over them as usual.</p>
<p><pre class="brush: scala;">
scala&gt; val my2d = Array.fill(2,3)(1.0)
my2d: Array[Array[Double]] = Array(Array(1.0, 1.0, 1.0), Array(1.0, 1.0, 1.0))

scala&gt; my2d.map(row =&gt; row.map(x=&gt;x+1))
res11: Array[Array[Double]] = Array(Array(2.0, 2.0, 2.0), Array(2.0, 2.0, 2.0))
</pre></p>
<p>For dictionaries (Maps), you can use mutable HashMaps to create an empty Map and then add elements to it. For that, see this blog post:</p>
<p>http://bcomposes.wordpress.com/2011/09/19/first-steps-in-scala-for-beginning-programmers-part-8/</p>
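<p>Without reproducing the linked post, the general idea can be sketched as follows (the object and method names are invented): start with an empty mutable HashMap, update it while iterating, and convert it to an immutable Map at the end.</p>

```scala
import scala.collection.mutable

// Hypothetical names, for illustration only.
object WordCounts {
  // Count how often each word occurs, using a mutable HashMap as a dictionary.
  def count(words: Seq[String]): Map[String, Int] = {
    val counts = mutable.HashMap.empty[String, Int]
    words.foreach { word =>
      counts(word) = counts.getOrElse(word, 0) + 1
    }
    counts.toMap
  }
}
```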
<p><em><span style="color:#0000ff;">Q. Is the <strong>apply</strong> function similar to a constructor in C++ or Java? Where will the <strong>apply</strong> function be practically used? Is it for initialising values of attributes?</span></em></p>
<p>A. No, the <strong>apply</strong> function is like any other function except that it allows you to call it without writing out &#8220;apply&#8221;. Consider the following class.</p>
<p><pre class="brush: scala;">
class AddX (x: Int) {
  def apply(y: Int) = x+y
  override def toString = &quot;My number is &quot; + x
}
</pre></p>
<p>Here&#8217;s how we can use it.</p>
<p><pre class="brush: scala;">
scala&gt; val add1 = new AddX(1)
add1: AddX = My number is 1

scala&gt; add1(4)
res0: Int = 5

scala&gt; add1.apply(4)
res1: Int = 5

scala&gt; add1.toString
res2: java.lang.String = My number is 1
</pre></p>
<p>So, the <strong>apply</strong> method is just (very handy) syntactic sugar that allows you to specify one function as fundamental to a class you have designed (actually, you can have multiple <strong>apply</strong> methods as long as each one has a unique parameter list). For example, with Lists, the <strong>apply</strong> method returns the value at the index provided, and for Maps it returns the value associated with the given key.</p>
<p><pre class="brush: scala;">
scala&gt; val foo = List(1,2,3)
foo: List[Int] = List(1, 2, 3)

scala&gt; foo(2)
res3: Int = 3

scala&gt; foo.apply(2)
res4: Int = 3

scala&gt; val bar = Map(1-&gt;2,3-&gt;4)
bar: scala.collection.immutable.Map[Int,Int] = Map(1 -&gt; 2, 3 -&gt; 4)

scala&gt; bar(1)
res5: Int = 2

scala&gt; bar.apply(1)
res6: Int = 2
</pre></p>
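<p>As for having several <strong>apply</strong> methods, here is a sketch with a made-up <strong>Greeter</strong> class that defines two of them with different parameter lists. Which one gets called depends on the arguments supplied.</p>

```scala
// Hypothetical class, for illustration only.
class Greeter(greeting: String) {
  // Two apply methods, distinguished by their parameter lists.
  def apply(name: String): String = greeting + ", " + name + "!"
  def apply(name: String, times: Int): String =
    List.fill(times)(apply(name)).mkString(" ")
}

object GreeterDemo {
  val hello = new Greeter("Hello")
  val once = hello("Ann")      // calls apply(name)
  val twice = hello("Ann", 2)  // calls apply(name, times)
}
```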
<p><em><span style="color:#0000ff;">Q. In <a title="First steps in Scala for beginning programmers, Part 11" href="http://bcomposes.wordpress.com/2011/10/26/first-steps-in-scala-for-beginning-programmers-part-11/">the SBT tutorial</a> you discuss &#8220;Node&#8221; and &#8220;Value&#8221; as being case classes. What is the alternative to a case class?</span></em></p>
<p>A. A normal class. Case classes are the special case. They do two things (and more) for you. The first is that you don&#8217;t have to use &#8220;new&#8221; to create a new object. Consider the following otherwise identical classes.</p>
<p><pre class="brush: scala;">
scala&gt; class NotACaseClass (val data: Int)
defined class NotACaseClass

scala&gt; case class IsACaseClass (val data: Int)
defined class IsACaseClass

scala&gt; val foo = new NotACaseClass(4)
foo: NotACaseClass = NotACaseClass@a5c0f8f

scala&gt; val bar = IsACaseClass(4)
bar: IsACaseClass = IsACaseClass(4)
</pre></p>
<p>That may seem like a little thing, but it can significantly improve code readability. Consider creating Lists within Lists within Lists if you had to use &#8220;new&#8221; all the time, for example. This is definitely true for <strong>Node</strong> and <strong>Value</strong>, which are used to build trees.</p>
<p>Case classes also support matching, as in the following.</p>
<p><pre class="brush: scala;">
scala&gt; val IsACaseClass(x) = bar
x: Int = 4
</pre></p>
<p>A normal class cannot do this.</p>
<p><pre class="brush: scala;">
scala&gt; val NotACaseClass(x) = foo
&lt;console&gt;:13: error: not found: value NotACaseClass
val NotACaseClass(x) = foo
^
&lt;console&gt;:13: error: recursive value x needs type
val NotACaseClass(x) = foo
^
</pre></p>
<p>If you mix the case class into a List and map over it, you can match it like you can with other classes, like Lists and Ints. Consider the following heterogeneous List.</p>
<p><pre class="brush: scala;">
scala&gt; val stuff = List(IsACaseClass(3), List(2,3), IsACaseClass(5), 4)
stuff: List[Any] = List(IsACaseClass(3), List(2, 3), IsACaseClass(5), 4)
</pre></p>
<p>We can convert this to a List of Ints by processing each element according to its type by matching.</p>
<p><pre class="brush: scala;">
scala&gt; stuff.map { case List(x,y) =&gt; x; case IsACaseClass(x) =&gt; x; case x: Int =&gt; x }
&lt;console&gt;:13: warning: match is not exhaustive!
missing combination              *           Nil             *             *

stuff.map { case List(x,y) =&gt; x; case IsACaseClass(x) =&gt; x; case x: Int =&gt; x }
^

warning: there were 1 unchecked warnings; re-run with -unchecked for details
res10: List[Any] = List(3, 2, 5, 4)
</pre></p>
<p>If you don&#8217;t want to see the warning in the REPL, add a case for things that don&#8217;t match that throws a MatchError.</p>
<p><pre class="brush: scala;">
scala&gt; stuff.map { case List(x,y) =&gt; x; case IsACaseClass(x) =&gt; x; case x: Int =&gt; x; case _ =&gt; throw new MatchError }
warning: there were 1 unchecked warnings; re-run with -unchecked for details
res13: List[Any] = List(3, 2, 5, 4)
</pre></p>
<p>Better yet, return Options (using None for the unmatched case) and use <strong>flatMap</strong> instead.</p>
<p><pre class="brush: scala;">
scala&gt; stuff.flatMap { case List(x,y) =&gt; Some(x); case IsACaseClass(x) =&gt; Some(x); case x: Int =&gt; Some(x); case _ =&gt; None }
warning: there were 1 unchecked warnings; re-run with -unchecked for details
res14: List[Any] = List(3, 2, 5, 4)
</pre></p>
<p><em><span style="color:#0000ff;">Q. In C++ the default access specifier is private; in Java one needs to specify private or public for each class member, whereas in Scala the default access specifier for a class is public. What could be the design motivation behind this when one of the purposes of a class is data hiding?</span></em></p>
<p>A. The reason is that Scala has a much more refined access specification scheme than Java that makes <strong>public</strong> the rational choice. See the discussion here:</p>
<p>http://stackoverflow.com/questions/4656698/default-public-access-in-scala</p>
<p>Another key aspect of this is that the general emphasis in Scala is on using immutable data structures, so there isn&#8217;t any danger of someone changing the internal state of your objects if you have designed them in this way. This in turn gets rid of the ridiculous getter and setter methods that breed and multiply in Java programs. See &#8220;Why getters and setters are evil&#8221; for more discussion:</p>
<p>http://www.javaworld.com/javaworld/jw-09-2003/jw-0905-toolbox.html</p>
<p>After you get used to programming in Scala, the whole getter/setter thing that is so common in Java code is pretty much gag worthy.</p>
<p>In general, it is still a good idea to use <strong>private[this]</strong> as a modifier to methods and variables whenever they are only needed by an object itself.</p>
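<p>To illustrate the point about immutability, here is a sketch with a made-up <strong>Account</strong> class. Its fields are public but cannot be reassigned, so there is nothing for a setter to do, and an update simply returns a new object.</p>

```scala
// Hypothetical class, for illustration only.
case class Account(owner: String, balance: Int) {
  // Fields are public but immutable, so no getters or setters are needed.
  // A deposit returns a new Account via copy rather than changing this one.
  def deposit(amount: Int): Account = copy(balance = balance + amount)
}

object AccountDemo {
  val before = Account("ann", 10)
  val after = before.deposit(5)
}
```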
<p><em><span style="color:#0000ff;">Q. How do we define overloaded constructors in Scala?</span></em></p>
<p><em><span style="color:#0000ff;">Q. The way a class is defined in Scala introduced in the tutorial, seems to have only one constructor. Is there any way to provide multiple constructors like Java?</span></em></p>
<p>A. You can add additional constructors with <strong>this</strong> declarations.</p>
<p><pre class="brush: scala;">
class SimpleTriple (x: Int, y: Int, z: String) {
  def this (x: Int, z: String) = this(x,0,z)
  def this (x: Int, y: Int) = this(x,y,&quot;a&quot;)
  override def toString = x + &quot;:&quot; + y + &quot;:&quot; + z
}

scala&gt; val foo = new SimpleTriple(1,2,&quot;hello&quot;)
foo: SimpleTriple = 1:2:hello

scala&gt; val bar = new SimpleTriple(1,&quot;goodbye&quot;)
bar: SimpleTriple = 1:0:goodbye

scala&gt; val baz = new SimpleTriple(1,3)
baz: SimpleTriple = 1:3:a
</pre></p>
<p>Notice that you must supply an initial value for every one of the parameters of the class. This contrasts with Java, which allows you to leave some fields uninitialized (and which tends to lead to nasty bugs and bad design).</p>
<p>Note that you can also provide defaults to parameters.</p>
<p><pre class="brush: scala;">
class SimpleTripleWithDefaults (x: Int, y: Int = 0, z: String = &quot;a&quot;) {
  override def toString = x + &quot;:&quot; + y + &quot;:&quot; + z
}

scala&gt; val foo = new SimpleTripleWithDefaults(1)
foo: SimpleTripleWithDefaults = 1:0:a

scala&gt; val bar = new SimpleTripleWithDefaults(1,2)
bar: SimpleTripleWithDefaults = 1:2:a
</pre></p>
<p>However, you can&#8217;t omit a middle parameter while specifying the last one.</p>
<p><pre class="brush: scala;">
scala&gt; val foo = new SimpleTripleWithDefaults(1,&quot;xyz&quot;)
&lt;console&gt;:12: error: type mismatch;
found   : java.lang.String(&quot;xyz&quot;)
required: Int
Error occurred in an application involving default arguments.
val foo = new SimpleTripleWithDefaults(1,&quot;xyz&quot;)
^
</pre></p>
<p>But, you can name the parameters in the initialization if you want to be able to do this.</p>
<p><pre class="brush: scala;">
scala&gt; val foo = new SimpleTripleWithDefaults(1,z=&quot;xyz&quot;)
foo: SimpleTripleWithDefaults = 1:0:xyz
</pre></p>
<p>You then have complete freedom to change the parameters around.</p>
<p><pre class="brush: scala;">
scala&gt; val foo = new SimpleTripleWithDefaults(z=&quot;xyz&quot;,x=42,y=3)
foo: SimpleTripleWithDefaults = 42:3:xyz
</pre></p>
<p><em><span style="color:#0000ff;">Q. I&#8217;m still not clear on the difference between classes and traits.  I guess I see a conceptual difference but I don&#8217;t really understand what the functional difference is &#8212; how is creating a &#8220;trait&#8221; different from creating a class with maybe fewer methods associated with it?</span></em></p>
<p>A. Yes, they are different. First off, traits are abstract, which means you cannot directly create instances of them. Consider the following contrast.</p>
<p><pre class="brush: scala;">
scala&gt; class FooClass
defined class FooClass

scala&gt; trait FooTrait
defined trait FooTrait

scala&gt; val fclass = new FooClass
fclass: FooClass = FooClass@1b499616

scala&gt; val ftrait = new FooTrait
&lt;console&gt;:8: error: trait FooTrait is abstract; cannot be instantiated
val ftrait = new FooTrait
^
</pre></p>
<p>You can extend a trait to make a concrete class, however.</p>
<p><pre class="brush: scala;">
scala&gt; class FooTraitExtender extends FooTrait
defined class FooTraitExtender

scala&gt; val ftraitExtender = new FooTraitExtender
ftraitExtender: FooTraitExtender = FooTraitExtender@53d26552
</pre></p>
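<p>As a small aside, a trait can still be instantiated indirectly by giving it a (possibly empty) body at the point of creation, which defines an anonymous class that extends the trait. The sketch below redeclares <strong>FooTrait</strong> so that it stands alone; the name <strong>anonymousFoo</strong> is just for illustration.</p>
<p><pre class="brush: scala;">
trait FooTrait

// The braces define an anonymous class extending FooTrait;
// it is that class, not the trait itself, that gets instantiated.
val anonymousFoo = new FooTrait {}

println(anonymousFoo.isInstanceOf[FooTrait])   // prints: true
</pre></p>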
<p>This gets more interesting if the trait has some methods, of course. Here&#8217;s a trait, <strong>Animal</strong>, that declares two abstract methods, <strong>makeNoise</strong> and <strong>doBehavior</strong>.</p>
<p><pre class="brush: scala;">
trait Animal {
  def makeNoise: String
  def doBehavior (other: Animal): String
}
</pre></p>
<p>We can extend this trait with new class definitions; each extending class must implement both of these methods (or else be declared abstract).</p>
<p><pre class="brush: scala;">
case class Bear (name: String, defaultBehavior: String = &quot;Regard warily...&quot;) extends Animal {
  def makeNoise = &quot;ROAR!&quot;
  def doBehavior (other: Animal) = other match {
    case b: Bear =&gt; makeNoise + &quot; I'm &quot; + name + &quot;.&quot;
    case m: Mouse =&gt; &quot;Eat it!&quot;
    case _ =&gt; defaultBehavior
  }
  override def toString = name
}

case class Mouse (name: String) extends Animal {
  def makeNoise = &quot;Squeak?&quot;
  def doBehavior (other: Animal) = other match {
    case b: Bear =&gt; &quot;Run!!!&quot;
    case m: Mouse =&gt; makeNoise + &quot; I'm &quot; + name + &quot;.&quot;
    case _ =&gt; &quot;Hide!&quot;
  }
  override def toString = name
}
</pre></p>
<p>Notice that Bear and Mouse have different parameter lists, but both can be Animals because they fully implement the Animal trait. We can now start creating objects of the Bear and Mouse classes and have them interact. We don&#8217;t need to use &#8220;new&#8221; because they are case classes, which come with a factory method for creating instances. (The type patterns in the match expressions of the <strong>doBehavior</strong> methods, such as <strong>case b: Bear</strong>, would work with ordinary classes too.)</p>
<p><pre class="brush: scala;">
val yogi = Bear(&quot;Yogi&quot;, &quot;Hello!&quot;)
val baloo = Bear(&quot;Baloo&quot;, &quot;Yawn...&quot;)
val grizzly = Bear(&quot;Grizzly&quot;)
val stuart = Mouse(&quot;Stuart&quot;)

println(yogi + &quot;: &quot; + yogi.makeNoise)
println(stuart + &quot;: &quot; + stuart.makeNoise)
println(&quot;Grizzly to Stuart: &quot; + grizzly.doBehavior(stuart))
</pre></p>
<p>We can also create a singleton object that is of the <strong>Animal</strong> type by using the following declaration.</p>
<p><pre class="brush: scala;">
object John extends Animal {
  def makeNoise = &quot;Hullo!&quot;
  def doBehavior (other: Animal) = other match {
    case b: Bear =&gt; &quot;Nice bear... nice bear...&quot;
    case _ =&gt; makeNoise
  }
  override def toString = &quot;John&quot;
}
</pre></p>
<p>Here, <strong>John</strong> is an object, not a class. Because this object implements the <strong>Animal</strong> trait, it successfully extends it and can act as an <strong>Animal</strong>. This means that a <strong>Bear</strong> like <strong>baloo</strong> can interact with <strong>John</strong>.</p>
<p><pre class="brush: scala;">
println(&quot;Baloo to John: &quot; + baloo.doBehavior(John))
</pre></p>
<p>The output of the above code when run as a script is the following.</p>
<p style="padding-left:30px;">Yogi: ROAR!<br />
Stuart: Squeak?<br />
Grizzly to Stuart: Eat it!<br />
Baloo to John: Yawn&#8230;</p>
<p>The closer distinction is between traits and abstract classes. In fact, everything shown above could have been done with <strong>Animal</strong> as an abstract class rather than as a trait. One difference is that an abstract class can take constructor parameters while a trait (in Scala 2) cannot. Another key difference between them is that traits can be used to support limited multiple inheritance, as shown in the next question/answer.</p>
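<p>To illustrate the constructor point, here is a minimal sketch of an abstract class that takes a constructor parameter. The names <strong>NamedAnimal</strong>, <strong>Dog</strong> and <strong>introduce</strong> are made up for this example and are not part of the code above.</p>
<p><pre class="brush: scala;">
// An abstract class can take constructor parameters, here a name.
abstract class NamedAnimal (val name: String) {
  def makeNoise: String   // abstract: concrete subclasses must implement it
  def introduce = makeNoise + &quot; I'm &quot; + name + &quot;.&quot;
}

class Dog (name: String) extends NamedAnimal(name) {
  def makeNoise = &quot;Woof!&quot;
}

val rex = new Dog(&quot;Rex&quot;)
println(rex.introduce)   // prints: Woof! I'm Rex.
</pre></p>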
<p><em><span style="color:#0000ff;">Q. Does Scala support multiple inheritance?</span></em></p>
<p>A. Yes, via traits with implementations of some methods. Here&#8217;s an example, with a trait <strong>Clickable</strong> that has an abstract (unimplemented) method <strong>getMessage</strong>, an implemented method <strong>click</strong>, and a private, reassignable variable <strong>numTimesClicked</strong> (the latter two show clearly that traits are different from Java interfaces).</p>
<p><pre class="brush: scala;">
trait Clickable {
  private var numTimesClicked = 0
  def getMessage: String
  def click = {
    val output = numTimesClicked + &quot;: &quot; + getMessage
    numTimesClicked += 1
    output
  }
}
</pre></p>
<p>Now let&#8217;s say we have a <strong>MessageBearer</strong> class (that we may have wanted for entirely different reasons having nothing to do with clicking).</p>
<p><pre class="brush: scala;">
class MessageBearer (val message: String) {
  override def toString = message
}
</pre></p>
<p>A new class can be now created by extending <strong>MessageBearer</strong> and &#8220;mixing in&#8221; the <strong>Clickable</strong> trait.</p>
<p><pre class="brush: scala;">
class ClickableMessageBearer(message: String) extends MessageBearer(message) with Clickable {
  def getMessage = message
}
</pre></p>
<p><strong>ClickableMessageBearer</strong> now has the abilities of both <strong>MessageBearers</strong> (which is to be able to retrieve its message) and <strong>Clickables</strong>.</p>
<p><pre class="brush: scala;">
scala&gt; val cmb1 = new ClickableMessageBearer(&quot;I'm number one!&quot;)
cmb1: ClickableMessageBearer = I'm number one!

scala&gt; val cmb2 = new ClickableMessageBearer(&quot;I'm number two!&quot;)
cmb2: ClickableMessageBearer = I'm number two!

scala&gt; cmb1.click
res3: java.lang.String = 0: I'm number one!

scala&gt; cmb1.message
res4: String = I'm number one!

scala&gt; cmb1.click
res5: java.lang.String = 1: I'm number one!

scala&gt; cmb2.click
res6: java.lang.String = 0: I'm number two!

scala&gt; cmb1.click
res7: java.lang.String = 2: I'm number one!

scala&gt; cmb2.click
res8: java.lang.String = 1: I'm number two!
</pre></p>
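<p>A class is not limited to one trait. The following sketch mixes two traits into a single class, each contributing implemented methods; the names <strong>Greeter</strong>, <strong>Counter</strong> and <strong>CountingGreeter</strong> are hypothetical and chosen only for illustration.</p>
<p><pre class="brush: scala;">
trait Greeter {
  def greeting: String          // abstract
  def greet = greeting + &quot;!&quot;    // implemented in terms of the abstract method
}

trait Counter {
  private var count = 0
  def increment = { count += 1; count }
}

// One class picks up the implemented methods of both traits.
class CountingGreeter (name: String) extends Greeter with Counter {
  def greeting = &quot;Hello from &quot; + name
}

val cg = new CountingGreeter(&quot;cg&quot;)
println(cg.greet)       // prints: Hello from cg!
println(cg.increment)   // prints: 1
println(cg.increment)   // prints: 2
</pre></p>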
<p><em><span style="color:#0000ff;">Q. Why are there <strong>toString</strong>, <strong>toInt</strong>, and <strong>toList</strong> functions, but there isn&#8217;t a <strong>toTuple</strong> function?</span></em></p>
<p>A. This is a basic question that leads directly to the more advanced topic of <em>implicits</em>. There are a number of reasons behind this. To start with, it is important to realize that there are many types of Tuples, starting with a Tuple with a single element (a Tuple1) up to 22 elements (a Tuple22). Note that when you use <strong>(,)</strong> to create a tuple, it is implicitly invoking a constructor for the corresponding TupleN of the correct arity.</p>
<p><pre class="brush: scala;">
scala&gt; val b = (1,2,3)
b: (Int, Int, Int) = (1,2,3)

scala&gt; val c = Tuple3(1,2,3)
c: (Int, Int, Int) = (1,2,3)

scala&gt; b==c
res4: Boolean = true
</pre></p>
<p>Given this, it is obviously not meaningful to have a function <strong>toTuple</strong> on Seqs (sequences) that are longer than 22. This means there is no generic way to have, say a List or Array, and then call <strong>toTuple</strong> on it and expect reliable behavior to happen.</p>
<p>However, if you want this functionality (even though limited by the above constraint of 22 elements max), Scala allows you to &#8220;add&#8221; methods to existing classes by using implicit definitions. You can find lots of discussions about implicits by searching for &#8220;scala implicits&#8221;. But, here&#8217;s an example that shows how it works for this particular case.</p>
<p><pre class="brush: scala;">
val foo = List(1,2)
val bar = List(3,4,5)
val baz = List(6,7,8,9)

foo.toTuple

class TupleAble[X] (elements: Seq[X]) {
  def toTuple = elements match {
    case Seq(a) =&gt; Tuple1(a)
    case Seq(a,b) =&gt; (a,b)
    case Seq(a,b,c) =&gt; (a,b,c)
    case _ =&gt; throw new RuntimeException(&quot;Sequence too long to be handled by toTuple: &quot; + elements)
  }
}

foo.toTuple

implicit def seqToTuple[X](x: Seq[X]) = new TupleAble(x)

foo.toTuple
bar.toTuple
baz.toTuple
</pre></p>
<p>If you put this into the Scala REPL, you&#8217;ll see that the first invocation of <strong>foo.toTuple</strong> gets an error:</p>
<p><pre class="brush: scala;">
scala&gt; foo.toTuple
&lt;console&gt;:9: error: value toTuple is not a member of List[Int]
foo.toTuple
^
</pre></p>
<p>Note that class <strong>TupleAble</strong> takes a Seq in its constructor and then provides the method <strong>toTuple</strong>, using that Seq. It is able to do so for Seqs with 1, 2 or 3 elements, and for anything else it throws an exception. (We could of course keep listing more cases and go up to 22-element tuples, but this shows the point.)</p>
<p>The second invocation of <strong>foo.toTuple</strong> still doesn&#8217;t work, because <strong>foo</strong> is a List (a kind of Seq) and there isn&#8217;t a <strong>toTuple</strong> method for Lists. That&#8217;s where the implicit function <strong>seqToTuple</strong> comes in. Once it is declared, Scala notes that you are trying to call <strong>toTuple</strong> on a Seq, notes that there is no such function for Seqs, but sees that there is an implicit conversion from Seqs to TupleAbles via <strong>seqToTuple</strong>, and then it sees that TupleAble has a <strong>toTuple</strong> method. Based on that, it compiles and then produces the desired behavior. This is a very handy ability of Scala that can really simplify your code if you use it well and with care.</p>
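<p>In Scala 2.10 and later, the wrapper class and the implicit conversion can be declared in one step with an <em>implicit class</em>. The following is a minimal sketch of that alternative rather than a replacement for the example above; the names <strong>TupleSyntax</strong>, <strong>SafeTupleAble</strong> and <strong>toPair</strong> are made up for illustration, and it returns an <strong>Option</strong> instead of throwing an exception.</p>
<p><pre class="brush: scala;">
object TupleSyntax {
  // Declares the wrapper class and the implicit conversion together.
  implicit class SafeTupleAble[X] (elements: Seq[X]) {
    // Some(pair) for exactly two elements, None for any other length.
    def toPair: Option[(X, X)] = elements match {
      case Seq(a, b) =&gt; Some((a, b))
      case _ =&gt; None
    }
  }
}

import TupleSyntax._

println(List(1,2).toPair)     // prints: Some((1,2))
println(List(3,4,5).toPair)   // prints: None
</pre></p>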
<p><span style="color:#888888;">Copyright 2012 Jason Baldridge</span></p>
<p><span style="color:#888888;">The text of this post is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License. Attribution may be provided by linking to www.jasonbaldridge.com and to this original post.</span></p>
<p><span style="color:#888888;">Suggestions, improvements, extensions and bug fixes welcome — please email Jason at jasonbaldridge@gmail.com or provide a comment to this post.</span></p>
]]></content:encoded>
			<wfw:commentRss>https://bcomposes.wordpress.com/2012/02/23/student-questions-about-scala-part-1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="https://secure.gravatar.com/avatar/1a5262c6575daf92f62f5cc6d9f1f00f?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jasonbaldridge</media:title>
		</media:content>
	</item>
		<item>
		<title>Variations for computing results from sequences in Scala</title>
		<link>https://bcomposes.wordpress.com/2012/02/14/variations-for-computing-results-from-sequences-in-scala/</link>
		<comments>https://bcomposes.wordpress.com/2012/02/14/variations-for-computing-results-from-sequences-in-scala/#comments</comments>
		<pubDate>Tue, 14 Feb 2012 21:06:35 +0000</pubDate>
		<dc:creator>jasonbaldridge</dc:creator>
				<category><![CDATA[programming]]></category>
		<category><![CDATA[scala]]></category>
		<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://bcomposes.wordpress.com/?p=346</guid>
		<description><![CDATA[Topics: iteration, mapping, for expressions, foreach loops, Lists, ListBuffers, Arrays, indexed sequences, recursion Introduction A common question from students who are new to Scala is: What is the difference between using the map function on lists, using for expressions and foreach loops? One of the major sources of confusion with regard to this question is &#8230;<p><a href="https://bcomposes.wordpress.com/2012/02/14/variations-for-computing-results-from-sequences-in-scala/" class="more-link">Read More</a></p><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=bcomposes.wordpress.com&amp;blog=24939937&amp;post=346&amp;subd=bcomposes&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><strong>Topics</strong>: <em>iteration, mapping, for expressions, foreach loops, Lists, ListBuffers, Arrays, indexed sequences, recursion</em></p>
<h2>Introduction</h2>
<p>A common question from students who are new to Scala is: What is the difference between using the map function on lists, using for expressions and foreach loops? One of the major sources of confusion with regard to this question is that a <em>for expression</em> in Scala is not the equivalent of <em>for loops</em> in languages like Python and Java; instead, the closest equivalent of such for loops is <em>foreach</em> in Scala. This distinction highlights the importance of understanding what it means to return values versus relying on side-effects to perform certain computations. It also helps reinforce some points about fixed versus reassignable variables and immutable versus mutable data structures.</p>
<h2>The task and its functional solution</h2>
<p>To demonstrate this, let&#8217;s consider a simple task. Given a <strong>List</strong> of words, compute two lists: one has the lengths of each word and the second indicates whether a word starts with a capital letter or not. For example, start with the following list.</p>
<p><pre class="brush: scala;">
scala&gt; val words = List(&quot;This&quot;, &quot;is&quot;, &quot;a&quot;, &quot;list&quot;, &quot;of&quot;, &quot;English&quot;, &quot;words&quot;, &quot;.&quot;)
words: List[java.lang.String] = List(This, is, a, list, of, English, words, .)
</pre></p>
<p>We can compute the two lists by mapping over the words list as follows.</p>
<p><pre class="brush: scala;">
scala&gt; words.map(_.length)
res0: List[Int] = List(4, 2, 1, 4, 2, 7, 5, 1)

scala&gt; words.map(_(0).isUpper)
res1: List[Boolean] = List(true, false, false, false, false, true, false, false)
</pre></p>
<p>So, that&#8217;s it. However, let&#8217;s do this without using different calls to the <strong>map</strong> function (or multiple <strong>foreach</strong> loops, as we&#8217;ll see below). The easiest way to do this is to map each word to a tuple containing the length and the <strong>Boolean</strong> indicating whether its first character is capitalized; this produces a list of tuples, which we unzip to get a tuple of Lists.</p>
<p><pre class="brush: scala;">
scala&gt; val (wlengthsMapUnzip, wcapsMapUnzip) =
|   words.map(word =&gt; (word.length, word(0).isUpper)).unzip
wlengthsMapUnzip: List[Int] = List(4, 2, 1, 4, 2, 7, 5, 1)
wcapsMapUnzip: List[Boolean] = List(true, false, false, false, false, true, false, false)
</pre></p>
<p>The key thing here is that the map function turns the <strong>List[String]</strong> words into a <strong>List[(Int, Boolean)]</strong>, which is to say it returns a value. We can assign that value to a variable, or use it immediately by calling <strong>unzip</strong> on it, which in turn returns a value that is a <strong>Tuple2(List[Int],List[Boolean])</strong>.</p>
<p>Before moving on let&#8217;s define a simple function to display the results of performing this computation, which we will do in various ways (and which all produce precisely the same results).</p>
<p><pre class="brush: scala;">
// Prints a heading followed by the two result lists, then a blank line.
def display (intro: String, wlengths: List[Int], wcaps: List[Boolean]): Unit = {
  println(intro)
  println(&quot;Lengths: &quot; + wlengths.mkString(&quot; &quot;))
  println(&quot;Caps: &quot; + wcaps.mkString(&quot; &quot;))
  println()
}
</pre></p>
<p>Calling this function with the result of mapping and unzipping as above, we get the following output.</p>
<p><pre class="brush: scala;">
scala&gt; display(&quot;Using map and unzip.&quot;, wlengthsMapUnzip, wcapsMapUnzip)
Using map and unzip.
Lengths: 4 2 1 4 2 7 5 1
Caps: true false false false false true false false
</pre></p>
<p>Okay, so now let&#8217;s start doing it the hard way. Rather than mapping over the original list, we&#8217;ll loop over the list with <strong>foreach</strong>, and perform a side-effect computation that builds up the two result sequences. This is the sort of thing that is typically done in non-functional languages with <strong>for</strong> loops, hence the use of <strong>foreach</strong> in Scala. We&#8217;ll explore several variations on this approach in turn.</p>
<h2>Second variation: use reassignable, immutable Lists</h2>
<p>We can use <em>reassignable</em> variables which are initialized to be empty Lists, and then prepend to them as we loop through the <strong>words</strong> list. We are thus using a variable that has the type of <strong>List</strong>, which is an immutable sequence data structure, but its value is being reassigned each time we pass through the loop.</p>
<p><pre class="brush: scala;">
var wlengthsReassign = List[Int]()
var wcapsReassign = List[Boolean]()
words.foreach { word =&gt;
  wlengthsReassign = word.length :: wlengthsReassign
  wcapsReassign = word(0).isUpper :: wcapsReassign
}

display(&quot;Using reassignable lists.&quot;, wlengthsReassign.reverse, wcapsReassign.reverse)
</pre></p>
<p>Note that we build up the lists by prepending, which means they come out of the loop in reverse order and thus must be reversed before being displayed. You can of course append to a List by creating a singleton List and concatenating the two Lists with the <strong>:::</strong> operator.</p>
<p><pre class="brush: scala;">
scala&gt; val foo = List(4,2)
foo: List[Int] = List(4, 2)

scala&gt; foo ::: List(7)
res0: List[Int] = List(4, 2, 7)
</pre></p>
<p>However, this is not recommended because it is computationally costly. Adding an element to the front (left) of a List is a constant time operation, whereas concatenating two lists requires time proportional to the length of the first list. That might not seem like a big deal until you are dealing with lists with thousands of elements, and then you&#8217;ll find that the same bit of code that prepends many times and then reverses is much faster than one which appends using the above strategy.</p>
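<p>The following sketch builds the same list both ways so the two strategies can be compared side by side. The size <strong>n</strong> and the variable names are arbitrary choices for illustration; no timings are shown because they depend on the machine.</p>
<p><pre class="brush: scala;">
val n = 5000

// Prepend in constant time, then reverse once at the end.
var prepended = List[Int]()
(0 until n).foreach { i =&gt; prepended = i :: prepended }
val viaPrepend = prepended.reverse

// Append by concatenation: each step copies the whole list built so far.
var appended = List[Int]()
(0 until n).foreach { i =&gt; appended = appended ::: List(i) }

println(viaPrepend == appended)   // prints: true
</pre></p>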
<h2>Third variation: use unreassignable, mutable (growable) ListBuffers</h2>
<p>Next, we can use a ListBuffer, which is a mutable sequence data structure that also happens to support constant time append operations. We can thus declare it as a <strong>val</strong>, and then use the method <strong>append</strong> to mutate the sequence so that it has a new element at the end. So, the variables referring to the sequences are <strong>not</strong> reassignable, but their values are mutable.</p>
<p><pre class="brush: scala;">
import collection.mutable.ListBuffer
val wlengthsBuffer = ListBuffer[Int]()
val wcapsBuffer = ListBuffer[Boolean]()
words.foreach { word =&gt;
  wlengthsBuffer.append(word.length)
  wcapsBuffer.append(word(0).isUpper)
}

display(&quot;Using mutable ListBuffer.&quot;, wlengthsBuffer.toList, wcapsBuffer.toList)
</pre></p>
<p>Note that we must convert the <strong>ListBuffers</strong> to <strong>Lists</strong> for the call to <strong>display</strong> in order to have the right types as arguments to that function.</p>
<p>Since they can efficiently grow (i.e., get longer), ListBuffers are a good choice for many problems where we need to accumulate a set of results, and especially when we don&#8217;t know how many results we will be accumulating. However, if you know the number of results you&#8217;ll be accumulating it&#8217;s probably better to use Arrays, as shown next.</p>
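<p>Here is a small sketch of the unknown-size case: we keep only the words that start with a capital letter, so we cannot know ahead of time how many results there will be. It uses <strong>+=</strong>, the operator form of adding an element to the end of a ListBuffer; the word list is redeclared so that the snippet stands alone, and <strong>capitalized</strong> is just an illustrative name.</p>
<p><pre class="brush: scala;">
import scala.collection.mutable.ListBuffer

val words = List(&quot;This&quot;, &quot;is&quot;, &quot;a&quot;, &quot;list&quot;, &quot;of&quot;, &quot;English&quot;, &quot;words&quot;, &quot;.&quot;)

// The buffer grows only when a word passes the test.
val capitalized = ListBuffer[String]()
words.foreach { word =&gt;
  if (word(0).isUpper) capitalized += word
}

println(capitalized.toList)   // prints: List(This, English)
</pre></p>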
<h2>Fourth variation: use unreassignable, mutable (but fixed length) Arrays</h2>
<p>Both of the above alternatives probably look a little strange to people coming from Java. In Java, you&#8217;d be more likely to do an imperative solution that involves initializing arrays that have the same length as words and then filling in respective indices as appropriate. To do this, use <strong>Array.fill(<em>lengthOfArray</em>)(<em>initialValue</em>)</strong>.</p>
<p><pre class="brush: scala;">
val wlengthsArray = Array.fill(words.length)(0)
val wcapsArray = Array.fill(words.length)(false)
words.indices.foreach { index =&gt;
  wlengthsArray(index) = words(index).length
  wcapsArray(index) = words(index)(0).isUpper
}

display(&quot;Using iteration and arrays.&quot;, wlengthsArray.toList, wcapsArray.toList)
</pre></p>
<p>We go through the indices and for each one compute the value and assign it to the appropriate index in the corresponding Array. Again, we need to convert the results to Lists before calling <strong>display</strong>. The indices method does exactly what you&#8217;d expect &#8212; it gives you the indices of the List.</p>
<p><pre class="brush: scala;">
scala&gt; words.indices
res2: scala.collection.immutable.Range = Range(0, 1, 2, 3, 4, 5, 6, 7)
</pre></p>
<p>A problem with the above <strong>foreach</strong> loop is that it requires indexing into Lists, which is generally a bad idea. Why? Because getting the <em>i</em>-th item from a list requires time proportional to <em>i</em>. The implementation for obtaining an item at a particular index <em>i</em> involves peeling off the head of the list to get its tail, and then seeking the <em>(i-1)</em>-th item of the tail, which requires peeling off its head and then seeking the <em>(i-2)</em>-th item, and so on. So, if you want to get the <em>10,000</em>th item in a list, you have to perform roughly 10,000 operations to get it. If the words list had 10,000 elements, you&#8217;d perform 10,000 basic computations just on the <strong>foreach</strong>, and for each element you would do <em>2*index</em> operations because the word at that index is looked up twice, which means doing about 20,000 operations on the <span style="text-decoration:underline;">last</span> index alone. Summed over all indices, that is on the order of 100 million operations, so the cost grows quadratically with the length of the list.</p>
<p>Note that indexing into Arrays is a constant time operation, so there is no problem with the left hand side of the assignments in the above loop.</p>
<p>You might think you can do better by first storing the word and then using it twice, e.g.</p>
<p><pre class="brush: scala;">
words.indices.foreach { index =&gt;
  val word = words(index)
  wlengthsArray(index) = word.length
  wcapsArray(index) = word(0).isUpper
}
</pre></p>
<p>This is better, but it only saves us half the operations. Since we were perfectly happy to loop over the words themselves before, we actually shouldn&#8217;t have to do this look up &#8212; we can do better by having a <em>reassignable</em> counter index that allows us to set values to the correct positions in the new Arrays we are creating.</p>
<p><pre class="brush: scala;">
val wlengthsArray2 = Array.fill(words.length)(0)
val wcapsArray2 = Array.fill(words.length)(false)
var index = 0
words.foreach { word =&gt;
  wlengthsArray2(index) = word.length
  wcapsArray2(index) = word(0).isUpper
  index += 1
}
</pre></p>
<p>Since this sort of pattern is fairly common, Scala provides a handy method on sequences called <strong>zipWithIndex</strong> which returns a List of the original elements paired with their indices.</p>
<p><pre class="brush: scala;">
scala&gt; words.zipWithIndex
res3: List[(java.lang.String, Int)] = List((This,0), (is,1), (a,2), (list,3), (of,4), (English,5), (words,6), (.,7))
</pre></p>
<p>In this way, we can have the <strong>foreach</strong> loop over such pairs. It is convenient in these cases to use the pattern matching abilities in <strong>foreach</strong> loops by using the <strong>case</strong> match on pairs, as below.</p>
<p><pre class="brush: scala;">
val wlengthsArray3 = Array.fill(words.length)(0)
val wcapsArray3 = Array.fill(words.length)(false)
words.zipWithIndex.foreach { case(word,index) =&gt;
  wlengthsArray3(index) = word.length
  wcapsArray3(index) = word(0).isUpper
}
</pre></p>
<p>It&#8217;s important to understand the cost of the operations you are using, especially in looping contexts where you are inherently doing the same basic operation multiple times.</p>
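<p>One cost worth knowing about: calling <strong>zipWithIndex</strong> on a List builds a whole new list of pairs before the loop starts. If that matters, one option (a sketch, not the only way) is to go through the list&#8217;s iterator, whose <strong>zipWithIndex</strong> produces the pairs one at a time. The shortened word list and the name <strong>lengths</strong> are just for illustration.</p>
<p><pre class="brush: scala;">
val words = List(&quot;This&quot;, &quot;is&quot;, &quot;a&quot;, &quot;list&quot;)
val lengths = Array.fill(words.length)(0)

// The iterator pairs each word with its index lazily, without an intermediate List.
words.iterator.zipWithIndex.foreach { case (word, index) =&gt;
  lengths(index) = word.length
}

println(lengths.mkString(&quot; &quot;))   // prints: 4 2 1 4
</pre></p>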
<h2>Indexed sequences (Vectors)</h2>
<p>It is worth pointing out that when you want an immutable sequence that allows efficient indexing, you should use Vectors.</p>
<p><pre class="brush: scala;">
scala&gt; val bar = Vector(1,2,3)
bar: scala.collection.immutable.Vector[Int] = Vector(1, 2, 3)
</pre></p>
<p>If you have a <strong>List</strong> in hand but want to index into it repeatedly, you can convert it to a <strong>Vector</strong> using <strong>toIndexedSeq</strong>.</p>
<p><pre class="brush: scala;">
scala&gt; val numbers = List(4,9,9,2,3,8)
numbers: List[Int] = List(4, 9, 9, 2, 3, 8)

scala&gt; numbers.toIndexedSeq
res5: scala.collection.immutable.IndexedSeq[Int] = Vector(4, 9, 9, 2, 3, 8)
</pre></p>
<p><strong>IndexedSeq</strong> is a supertype of sequences which are designed to be efficient for indexing, and <strong>Vector</strong> is the default &#8220;backing&#8221; implementation when you call <strong>toIndexedSeq</strong> on a <strong>List</strong>.</p>
<p>Of course, if you are only ever going over all the elements of a sequence in order, then Lists are likely to be preferable since they have a bit less overhead and they have some nice properties for pattern matching in match statements.</p>
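<p>As a sketch of how this ties back to the earlier index-based loop: converting the List once with <strong>toIndexedSeq</strong> makes each subsequent lookup by index efficient. The variable names below are illustrative only.</p>
<p><pre class="brush: scala;">
val words = List(&quot;This&quot;, &quot;is&quot;, &quot;a&quot;, &quot;list&quot;)

// Convert once up front; indexing into the result is efficient.
val indexedWords = words.toIndexedSeq

val lengths = Array.fill(indexedWords.length)(0)
indexedWords.indices.foreach { index =&gt;
  lengths(index) = indexedWords(index).length
}

println(lengths.mkString(&quot; &quot;))   // prints: 4 2 1 4
</pre></p>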
<h2>Using predefined functions for mapping over a sequence</h2>
<p>Another thing worth pointing out is that if you have a predefined function, you can pass that as the argument to <strong>map</strong>, which can lead to very concise code for this task. Assume you have defined a function that takes a <strong>String</strong> and produces a <strong>Tuple</strong> of its argument&#8217;s length and whether it starts with an upper-case letter.</p>
<p><pre class="brush: scala;">
def getLengthAndUpper = (word: String) =&gt; (word.length, word(0).isUpper)
</pre></p>
<p>The code for mapping over words with this function to get our desired lists is then very clean.</p>
<p><pre class="brush: scala;">
val (wlengthsFunction, wcapsFunction) = words.map(getLengthAndUpper).unzip
</pre></p>
<p>Of course, you would probably only do this if you needed that same function in other places. If not, it&#8217;s preferable to just use the anonymous function like in the first <strong>map</strong> example in this blog post. However, you can see that if you have a library of simple functions like this, you can now start writing much clearer and simpler code by <span style="text-decoration:underline;">reusing</span> them when mapping over different lists.</p>
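<p>To give a flavor of that kind of reuse, here is a sketch in which one small named function is passed to several different collection methods on different lists. The helper <strong>startsWithUpper</strong> and the <strong>names</strong> list are hypothetical, introduced only for this example.</p>
<p><pre class="brush: scala;">
// A small reusable predicate; it also copes with empty strings.
def startsWithUpper (word: String): Boolean = word.nonEmpty &amp;&amp; word(0).isUpper

val words = List(&quot;This&quot;, &quot;is&quot;, &quot;a&quot;, &quot;list&quot;, &quot;of&quot;, &quot;English&quot;, &quot;words&quot;, &quot;.&quot;)
val names = List(&quot;yogi&quot;, &quot;Baloo&quot;, &quot;&quot;, &quot;Stuart&quot;)

println(words.filter(startsWithUpper))   // prints: List(This, English)
println(names.map(startsWithUpper))      // prints: List(false, true, false, true)
println(names.count(startsWithUpper))    // prints: 2
</pre></p>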
<h2>For expressions</h2>
<p>Notice that the previous loops were all <strong>foreach</strong> ones, whereas Java programmers and Pythonistas will be used to <strong>for</strong> loops. Scala doesn&#8217;t have <strong>for</strong> <em>loops</em> &#8212; it has <strong>for</strong> <em>expressions</em>. A common question then is: What&#8217;s the difference? What is a <strong>for</strong> expression for and why isn&#8217;t it a <strong>for</strong> loop? The difference is that<span style="text-decoration:underline;"> an expression returns a value</span>, so while <strong>foreach</strong> allows you to plow through a sequence and do some operation to each element, a <strong>for</strong> expression allows you to return a value for each element. Consider the following, in which we <em>yield</em> the square of each integer in a <strong>List[Int]</strong>.</p>
<p><pre class="brush: scala;">
scala&gt; val numbers = List(4,9,9,2,3,8)
numbers: List[Int] = List(4, 9, 9, 2, 3, 8)

scala&gt; for (num &lt;- numbers) yield num*num
res6: List[Int] = List(16, 81, 81, 4, 9, 64)
</pre></p>
<p>We get a result, whereas a <strong>foreach</strong> loop just does the computation and returns no useful value (its result type is <strong>Unit</strong>).</p>
<p><pre class="brush: scala;">
scala&gt; numbers.foreach { num =&gt; num * num }
</pre></p>
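<p>To make the contrast concrete, the sketch below captures what each construct returns; the variable names <strong>squares</strong> and <strong>nothing</strong> are chosen just for illustration.</p>
<p><pre class="brush: scala;">
val numbers = List(4,9,9,2,3,8)

// map (like a for expression with yield) returns a new List.
val squares = numbers.map(num =&gt; num * num)
println(squares)   // prints: List(16, 81, 81, 4, 9, 64)

// foreach computes each square and discards it; the overall result is Unit.
val nothing: Unit = numbers.foreach { num =&gt; num * num }
println(nothing)   // prints: ()
</pre></p>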
<p>The key is that we yield a value for each element in the <strong>for</strong> expression. In this case, it is basically equivalent to using <strong>map</strong>. Here it is in the context of the running <strong>words</strong> example.</p>
<p><pre class="brush: scala;">
val (wlengthsFor, wcapsFor) =
  (for (word &lt;- words) yield (word.length, word(0).isUpper)).unzip

display(&quot;Using a for expression.&quot;, wlengthsFor, wcapsFor)
</pre></p>
<p>Having said all this, it turns out you <span style="text-decoration:underline;">can</span> use a <strong>for</strong> expression as a loop, without returning any values, e.g. as follows.</p>
<p><pre class="brush: scala;">
scala&gt; for (num &lt;- numbers) { println(num*num) }
16
81
81
4
9
64
</pre></p>
<p>I think it is generally better to use a <strong>foreach</strong> loop for such cases so that it is clear that you are only performing side-effects, like printing, reassigning the values of <strong>var</strong> variables, or modifying mutable data structures. However, there are some cases where a <strong>for</strong> expression can be more convenient, for example when working through multiple lists and doing various filtering operations. Here&#8217;s a quick example to give a flavor of this. Given two lists, we can enumerate the cross product of all their elements.</p>
<p><pre class="brush: scala;">
scala&gt; val numbers = List(4,9,9,2,3,8)
numbers: List[Int] = List(4, 9, 9, 2, 3, 8)

scala&gt; val letters = List('a','C','f','d','z')
letters: List[Char] = List(a, C, f, d, z)

scala&gt; for (n &lt;- numbers; l &lt;- letters) print(&quot;(&quot; + n + &quot;,&quot; + l + &quot;) &quot;)
(4,a) (4,C) (4,f) (4,d) (4,z) (9,a) (9,C) (9,f) (9,d) (9,z) (9,a) (9,C) (9,f) (9,d) (9,z) (2,a) (2,C) (2,f) (2,d) (2,z) (3,a) (3,C) (3,f) (3,d) (3,z) (8,a) (8,C) (8,f) (8,d) (8,z)
</pre></p>
<p>You can filter on these values as well to restrict the output to just some reduced set of elements of interest.</p>
<p><pre class="brush: scala;">
scala&gt; for (n &lt;- numbers; if (n&gt;4); l &lt;- letters) print(&quot;(&quot; + n + &quot;,&quot; + l + &quot;) &quot;)
(9,a) (9,C) (9,f) (9,d) (9,z) (9,a) (9,C) (9,f) (9,d) (9,z) (8,a) (8,C) (8,f) (8,d) (8,z)
</pre></p>
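<p>When such a <strong>for</strong> expression uses <strong>yield</strong>, it returns the filtered pairs as a value instead of printing them. As a rough guide, the compiler rewrites it into calls to <strong>withFilter</strong>, <strong>flatMap</strong> and <strong>map</strong>; the sketch below spells out one such equivalent by hand. The shortened <strong>letters</strong> list and the names <strong>viaFor</strong> and <strong>viaMethods</strong> are just for illustration.</p>
<p><pre class="brush: scala;">
val numbers = List(4,9,9,2,3,8)
val letters = List('a','C')

// A for expression with a filter, yielding pairs as a value.
val viaFor = for (n &lt;- numbers; if (n&gt;4); l &lt;- letters) yield (n, l)

// A hand-written equivalent using the underlying collection methods.
val viaMethods = numbers.withFilter(n =&gt; n &gt; 4).flatMap(n =&gt; letters.map(l =&gt; (n, l)))

println(viaFor)                 // prints: List((9,a), (9,C), (9,a), (9,C), (8,a), (8,C))
println(viaFor == viaMethods)   // prints: true
</pre></p>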
<p>There is much more to this, but I&#8217;ll leave it here since using <strong>for</strong> expressions in this way is a rich enough topic for several blog posts in and of itself. Also, there is a detailed discussion of it in Odersky, Spoon, and Venners&#8217; book &#8220;Programming in Scala.&#8221;</p>
<h2>Fifth variation: use a recursive function</h2>
<p>It&#8217;s worth pointing out one other way of building up lengths and caps lists. Recursive functions are functions which look at their input and then either return a result for a base case or compute a result and then call <span style="text-decoration:underline;">themself</span> with that result. It&#8217;s pretty standard stuff that computer scientists love and which tends to get used a lot more in functional programming than in imperative programming. Here, I&#8217;ll show how to do the same task done before using recursion, but without an in-depth explanation, so either you&#8217;ll already know how to do recursion and you can see it in Scala for the same problem context as above, or you don&#8217;t know much about recursion but can use this as an example of how it is employed for a task you already understand from the vantages given above. So, in the later case, hopefully it will be useful in conjunction with other tutorials on recursion.</p>
<p>First, we need to define the recursive function, given below. It has three parameters: one for the list of words, one for the already computed lengths and another for the already computed caps. At each step, it prepends one item to each of the two accumulated lists, computed from the head of the <strong>inputWords</strong> list, and then calls itself on the tail. When <strong>inputWords</strong> is empty, it returns a pair with the list of lengths first and the list of caps values second. Because items are prepended, both lists come out in the reverse order of the input.</p>
<p><pre class="brush: scala;">
def lengthCapRecursive(
  inputWords: List[String],
  lengths: List[Int],
  caps: List[Boolean]): (List[Int], List[Boolean]) = inputWords match {

  case Nil =&gt;
    (lengths, caps)
  case head :: tail =&gt;
    lengthCapRecursive(tail, head.length :: lengths, head(0).isUpper :: caps)
}
</pre></p>
<p>We can call this function directly, but it is often convenient to provide a secondary function that makes the initial call to this function with empty result lists as the second and third parameters. The secondary function can then perform the reversal and return the desired computed lists.</p>
<p><pre class="brush: scala;">
def lengthCapRecursive(inputWords: List[String]): (List[Int], List[Boolean]) = {
  // Start the recursion with empty result lists, then undo the reversal.
  val (l,c) = lengthCapRecursive(inputWords, List[Int](), List[Boolean]())
  (l.reverse, c.reverse)
}
</pre></p>
<p>Getting the result is then just a matter of calling that function with our <strong>words</strong> list.</p>
<p><pre class="brush: scala;">
val (wlengthsRecursive, wcapsRecursive) = lengthCapRecursive(words)

display(&quot;Using a recursive function.&quot;, wlengthsRecursive, wcapsRecursive)
</pre></p>
<p>A slight variation on this is to &#8220;hide&#8221; the recursive function inside the secondary function, which then effectively acts as a wrapper around it. This is often considered cleaner because the programmer can ensure that the initialization is done correctly and that the recursive function itself isn&#8217;t given malformed inputs.</p>
<p><pre class="brush: scala;">
def lengthCapRecurWrap(inputWords: List[String]): (List[Int], List[Boolean]) = {

  // This function is hidden from code outside lengthCapRecurWrap.
  def lengthCapRecurHelp(
    inputWords: List[String],
    lengths: List[Int],
    caps: List[Boolean]): (List[Int], List[Boolean]) = inputWords match {

    case Nil =&gt;
      (lengths, caps)
    case head :: tail =&gt;
      lengthCapRecurHelp(tail, head.length :: lengths, head(0).isUpper :: caps)
  }

  val (l,c) = lengthCapRecurHelp(inputWords, List[Int](), List[Boolean]())
  (l.reverse, c.reverse)

}

val (wlengthsRecurWrap, wcapsRecurWrap) = lengthCapRecurWrap(words)

display(&quot;Using a recursive function contained in a wrapper.&quot;, wlengthsRecurWrap, wcapsRecurWrap)
</pre></p>
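<p>Both versions above make the recursive call as the very last step of the <strong>case head :: tail</strong> branch, so they are tail recursive. As a sketch of how you can ask the compiler to check that, here is the wrapper again with the <strong>@tailrec</strong> annotation from the standard library. The names <em>lengthCapTailrec</em>, <em>loop</em>, <em>remaining</em> and <em>sample</em> are mine, for illustration only.</p>
<p><pre class="brush: scala;">
import scala.annotation.tailrec

def lengthCapTailrec(inputWords: List[String]): (List[Int], List[Boolean]) = {

  // Compilation fails if this function is not tail recursive.
  @tailrec
  def loop(
    remaining: List[String],
    lengths: List[Int],
    caps: List[Boolean]): (List[Int], List[Boolean]) = remaining match {

    case Nil =&gt;
      (lengths, caps)
    case head :: tail =&gt;
      loop(tail, head.length :: lengths, head(0).isUpper :: caps)
  }

  // Prepending builds the result lists back to front, hence the reverse.
  val (l,c) = loop(inputWords, List[Int](), List[Boolean]())
  (l.reverse, c.reverse)
}

// A made-up input list for trying it out.
val sample = List(&quot;Scala&quot;, &quot;is&quot;, &quot;fun&quot;)
println(lengthCapTailrec(sample))
</pre></p>
<p>For the sample list, the lengths come out as 5, 2, 3 and the caps values as true, false, false, in the same order as the input words.</p>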
<h2>Conclusion</h2>
<p>So, that provides an overview of different ways of obtaining the same results, along with some explanation of the computational properties of each solution. These are considerations that are likely to crop up in your own code, so you should be aware of them.</p>
<p>Clearly there are many ways of getting the same thing done in Scala. This can be hard for newcomers to the language since they don&#8217;t have good intuitions about which approach is better in different circumstances, but it is quite valuable to have these options as you become more savvy and understand what the costs and benefits of using different data structures and different ways of iterating are.</p>
<p>All of the code from the above snippets is gathered together in the GitHub gist <a href="https://gist.github.com/1830413">ListComputations.scala</a>. You can save it as a file and run it as &#8220;<strong>scala ListComputations.scala</strong>&#8221; to see the output and play around with modifications to the code.</p>
]]></content:encoded>
			<wfw:commentRss>https://bcomposes.wordpress.com/2012/02/14/variations-for-computing-results-from-sequences-in-scala/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="https://secure.gravatar.com/avatar/1a5262c6575daf92f62f5cc6d9f1f00f?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jasonbaldridge</media:title>
		</media:content>
	</item>
		<item>
		<title>Grand Christmas Draw</title>
		<link>https://bcomposes.wordpress.com/2011/12/25/grand-christmas-draw-3/</link>
		<comments>https://bcomposes.wordpress.com/2011/12/25/grand-christmas-draw-3/#comments</comments>
		<pubDate>Mon, 26 Dec 2011 04:05:23 +0000</pubDate>
		<dc:creator>jasonbaldridge</dc:creator>
				<category><![CDATA[Random thoughts]]></category>

		<guid isPermaLink="false">http://bcomposes.wordpress.com/?p=208</guid>
		<description><![CDATA[It&#8217;s Christmas day, and I thought I&#8217;d share a wee thing of interest that came my way this week. I was a graduate student at the University of Edinburgh in Scotland from 1999 to 2002, and during that time I purchased a great deal of my clothes at thrift shops around the town &#8212; it &#8230;<p><a href="https://bcomposes.wordpress.com/2011/12/25/grand-christmas-draw-3/" class="more-link">Read More</a></p><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=bcomposes.wordpress.com&amp;blog=24939937&amp;post=208&amp;subd=bcomposes&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s Christmas day, and I thought I&#8217;d share a <a href="http://languagelog.ldc.upenn.edu/nll/?p=3650">wee</a> thing of interest that came my way this week. I was a graduate student at the University of Edinburgh in Scotland from 1999 to 2002, and during that time I purchased a great many of my clothes at thrift shops around the town; it definitely made the pounds/dollars stretch quite a bit more. In addition to clothes, these shops sold used books, and I sometimes couldn&#8217;t resist picking one or two up. One of those was <a href="http://en.wikipedia.org/wiki/Watership_Down">Watership Down</a>, a children&#8217;s book about rabbits at war (really!) that I had loved as a kid. I happened to pick it up from my bookshelf the other day. I leafed through it, curious to see how long I might have to wait until I can read it to my now two-and-a-half-year-old son. Inside, I found a bit of history in the form of the following raffle ticket for the Edinburgh Wanderers rugby union team, from 1978:</p>
<div id="attachment_343" class="wp-caption alignnone" style="width: 509px"><a href="http://bcomposes.files.wordpress.com/2011/12/wanderers1.jpg"><img class=" wp-image-343" title="wanderers" src="http://bcomposes.files.wordpress.com/2011/12/wanderers1.jpg?w=499&#038;h=351" alt="" width="499" height="351" /></a><p class="wp-caption-text">Edinburgh Wanderers 1978 Raffle Ticket</p></div>
<p>It turns out that the Edinburgh Wanderers is no more: they merged with Murrayfield RFC to become the <a href="http://en.wikipedia.org/wiki/Murrayfield_Wanderers_RFC">Murrayfield Wanderers RFC</a> in 1997. They play at Murrayfield Stadium, which I biked past from time to time on my way to rent cars to take out into the Scottish Highlands (one of the many great things about living and studying in Edinburgh).</p>
<p>I love the second and third prizes, and the &#8220;etc., etc.&#8221; reduplication. What is a &#8220;giant food hamper&#8221; anyway?! Exactly how big is &#8220;giant&#8221; in that context? And a gallon of whisky doesn&#8217;t sound like a bad third prize, depending on which distillery made it. It might not have been too shabby, given that 100 pounds in 1978 would be worth 450 to 750 pounds today. Some <a href="http://en.wikipedia.org/wiki/Caol_Ila">Caol Ila</a>, perhaps?</p>
<p>I can only presume that this ticket was not a winner and was thus relegated to being a bookmark&#8230; I&#8217;ll be happy to keep using it as such for myself now.</p>
<p>With that, I simply say: a very Merry Christmas and a Grand Christmas Draw to all!</p>
]]></content:encoded>
			<wfw:commentRss>https://bcomposes.wordpress.com/2011/12/25/grand-christmas-draw-3/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="https://secure.gravatar.com/avatar/1a5262c6575daf92f62f5cc6d9f1f00f?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jasonbaldridge</media:title>
		</media:content>

		<media:content url="http://bcomposes.files.wordpress.com/2011/12/wanderers1.jpg?w=300" medium="image">
			<media:title type="html">wanderers</media:title>
		</media:content>
	</item>
		<item>
		<title>First steps in Scala for beginning programmers, Part 12</title>
		<link>https://bcomposes.wordpress.com/2011/11/14/first-steps-in-scala-for-beginning-programmers-part-12/</link>
		<comments>https://bcomposes.wordpress.com/2011/11/14/first-steps-in-scala-for-beginning-programmers-part-12/#comments</comments>
		<pubDate>Mon, 14 Nov 2011 17:47:01 +0000</pubDate>
		<dc:creator>jasonbaldridge</dc:creator>
				<category><![CDATA[programming]]></category>
		<category><![CDATA[scala]]></category>
		<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://bcomposes.wordpress.com/?p=194</guid>
		<description><![CDATA[Topics: code blocks, coding style, closures, scala documentation project Preface This is part 12 of tutorials for first-time programmers getting into Scala. Other posts are on this blog, and you can get links to those and other resources on the links page of the Computational Linguistics course I’m creating these for. This post isn&#8217;t so much a &#8230;<p><a href="https://bcomposes.wordpress.com/2011/11/14/first-steps-in-scala-for-beginning-programmers-part-12/" class="more-link">Read More</a></p><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=bcomposes.wordpress.com&amp;blog=24939937&amp;post=194&amp;subd=bcomposes&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><strong>Topics</strong>: <em>code blocks, coding style, closures, scala documentation project<br />
</em></p>
<h2>Preface</h2>
<p>This is part 12 of tutorials for first-time programmers getting into Scala. Other posts are on this blog, and you can get links to those and other resources on <a href="http://icl-f11.utcompling.com/links">the links page of the Computational Linguistics course</a> I’m creating these for.</p>
<p>This post isn&#8217;t so much a tutorial as a comment on coding style with a few pointers on how code blocks in Scala work. It was instigated by patterns I was noting in my students&#8217; code; namely, that they were packing <strong>everything</strong> into one-liners with map after map with map after map, etc. These map-over-mapValues-over-map sequences of statements can be almost incomprehensible, both for some other person reading the code, and even for the person writing the code. I do admit to a fair amount of guilt in using such sequences of operations in class lectures and even in some of these tutorials. It works well in the REPL and when you have lots of text to explain what is going on around the piece of code in question, but it seems to have given a bad model for writing actual code. Oops!</p>
<p>So taking a step back, it is important to break operation sequences up a bit, but it isn&#8217;t always obvious to beginners how one can do so. Also, some students indicated that they had gotten the impression that one should try to pack everything onto one line if possible, and that breaking things up was somehow less advanced or less Scala-like. This is hardly the case. In fact, quite the contrary: it is crucial to use strategies that allow readers of your code to see the logic behind your statements. This isn&#8217;t just for others: you are likely to be a reader of your own code, often months after you originally wrote it, and you want to be kind to your future self.</p>
<h2>A simple example</h2>
<p>Here is an example of what you can do to give your code more breathing space. It&#8217;s not a very meaningful example, but it serves the purpose without being very complex. We begin by creating a list of all the letters in the alphabet.</p>
<p><pre class="brush: scala;">

scala&gt; val letters = &quot;abcdefghijklmnopqrstuvwxyz&quot;.map(_.toString).toList
letters: List[java.lang.String] = List(a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z)

</pre></p>
<p>Okay, now here&#8217;s our (pointless) task: we want to create a map from every letter (from &#8216;a&#8217; to &#8216;x&#8217;) to a list containing that letter and the two letters that follow it in reverse alphabetical order. (Did I mention this was a pointless task in and of itself?) Here&#8217;s a one-liner that can do it.</p>
<p><pre class="brush: scala;">

scala&gt; letters.zip((1 to 26).toList.sliding(3).toList).toMap.mapValues(_.map(x =&gt; letters(x-1)).sorted.reverse)
res0: scala.collection.immutable.Map[java.lang.String,List[java.lang.String]] = Map(e -&gt; List(g, f, e), s -&gt; List(u, t, s), x -&gt; List(z, y, x), n -&gt; List(p, o, n), j -&gt; List(l, k, j), t -&gt; List(v, u, t), u -&gt; List(w, v, u), f -&gt; List(h, g, f), a -&gt; List(c, b, a), m -&gt; List(o, n, m), i -&gt; List(k, j, i), v -&gt; List(x, w, v), q -&gt; List(s, r, q), b -&gt; List(d, c, b), g -&gt; List(i, h, g), l -&gt; List(n, m, l), p -&gt; List(r, q, p), c -&gt; List(e, d, c), h -&gt; List(j, i, h), r -&gt; List(t, s, r), w -&gt; List(y, x, w), k -&gt; List(m, l, k), o -&gt; List(q, p, o), d -&gt; List(f, e, d))
</pre></p>
<p>That did it, but that one-liner isn&#8217;t clear at all, so we should break things up a bit. Also, what is &#8220;_&#8221; and what is &#8220;x&#8221;? (By which I mean, what are they in terms of the <strong>logic</strong> of the program? We know they are ways of referring to the elements being mapped over, but they don&#8217;t help the human reading the code understand what is going on.)</p>
<p>Let&#8217;s start by creating the sliding list of number ranges.</p>
<p><pre class="brush: scala;">

scala&gt; val ranges = (1 to 26).toList.sliding(3).toList
ranges: List[List[Int]] = List(List(1, 2, 3), List(2, 3, 4), List(3, 4, 5), List(4, 5, 6), List(5, 6, 7), List(6, 7, 8), List(7, 8, 9), List(8, 9, 10), List(9, 10, 11), List(10, 11, 12), List(11, 12, 13), List(12, 13, 14), List(13, 14, 15), List(14, 15, 16), List(15, 16, 17), List(16, 17, 18), List(17, 18, 19), List(18, 19, 20), List(19, 20, 21), List(20, 21, 22), List(21, 22, 23), List(22, 23, 24), List(23, 24, 25), List(24, 25, 26))

</pre></p>
<p>It&#8217;s quite clear what that is now. (The <strong>sliding</strong> function is a beautiful thing, especially for natural language processing problems.)</p>
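<p>To give a flavor of that, here is a small sketch of using <strong>sliding</strong> to pull out bigrams (pairs of adjacent words) from a tokenized sentence. The sentence and the names <em>tokens</em> and <em>bigrams</em> are made up for this example.</p>
<p><pre class="brush: scala;">
// A made-up sentence, split into words on spaces.
val tokens = &quot;the cat sat on the mat&quot;.split(&quot; &quot;).toList

// Each window of size 2 is one bigram.
val bigrams = tokens.sliding(2).toList

bigrams.foreach(bigram =&gt; println(bigram.mkString(&quot; &quot;)))
</pre></p>
<p>Six tokens give five bigrams. The same arithmetic explains the ranges above: 26 numbers with a window of 3 give 26 - 3 + 1 = 24 windows.</p>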
<p>Next, we zip the letters with the ranges and create a <strong>Map</strong> from the pairs using <strong>toMap</strong>. This produces a Map from letters to lists of three numbers. Note that the lengths of the two lists are different: <em>letters</em> has 26 elements and <em>ranges</em> has 24, which means that the last two elements of <em>letters</em> (&#8216;y&#8217; and &#8216;z&#8217;) get dropped in the zipped list.</p>
<p><pre class="brush: scala;">

scala&gt; val letter2range = letters.zip(ranges).toMap
letter2range: scala.collection.immutable.Map[java.lang.String,List[Int]] = Map(e -&gt; List(5, 6, 7), s -&gt; List(19, 20, 21), x -&gt; List(24, 25, 26), n -&gt; List(14, 15, 16), j -&gt; List(10, 11, 12), t -&gt; List(20, 21, 22), u -&gt; List(21, 22, 23), f -&gt; List(6, 7, 8), a -&gt; List(1, 2, 3), m -&gt; List(13, 14, 15), i -&gt; List(9, 10, 11), v -&gt; List(22, 23, 24), q -&gt; List(17, 18, 19), b -&gt; List(2, 3, 4), g -&gt; List(7, 8, 9), l -&gt; List(12, 13, 14), p -&gt; List(16, 17, 18), c -&gt; List(3, 4, 5), h -&gt; List(8, 9, 10), r -&gt; List(18, 19, 20), w -&gt; List(23, 24, 25), k -&gt; List(11, 12, 13), o -&gt; List(15, 16, 17), d -&gt; List(4, 5, 6))

</pre></p>
<p>Note that we could have broken this into two steps, first creating the zipped list and then calling <strong>toMap</strong> on it. However, it is perfectly clear what the intent is when one zips two lists (creating a list of pairs) and then uses <strong>toMap</strong> on it immediately, so this is certainly a case where it makes sense to put multiple operations on a single line.</p>
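<p>The truncation that <strong>zip</strong> performs is easy to see on a smaller scale. In this sketch, <em>shortLetters</em>, <em>shortRanges</em> and <em>shortMap</em> are throwaway names of mine, not part of the running example.</p>
<p><pre class="brush: scala;">
val shortLetters = List(&quot;a&quot;, &quot;b&quot;, &quot;c&quot;, &quot;d&quot;)

// Four numbers with a window of three give only two ranges.
val shortRanges = (1 to 4).toList.sliding(3).toList

// zip stops at the end of the shorter list, so c and d are left out.
val shortMap = shortLetters.zip(shortRanges).toMap

println(shortMap)
</pre></p>
<p>The resulting Map has entries for &#8216;a&#8217; and &#8216;b&#8217; only, just as &#8216;y&#8217; and &#8216;z&#8217; are missing from <em>letter2range</em>.</p>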
<p>At this point we could of course process the <em>letter2range</em> <strong>Map</strong> using a one-liner.</p>
<p><pre class="brush: scala;">

scala&gt; letter2range.mapValues(_.map(x =&gt; letters(x-1)).sorted.reverse)
res1: scala.collection.immutable.Map[java.lang.String,List[java.lang.String]] = Map(e -&gt; List(g, f, e), s -&gt; List(u, t, s), x -&gt; List(z, y, x), n -&gt; List(p, o, n), j -&gt; List(l, k, j), t -&gt; List(v, u, t), u -&gt; List(w, v, u), f -&gt; List(h, g, f), a -&gt; List(c, b, a), m -&gt; List(o, n, m), i -&gt; List(k, j, i), v -&gt; List(x, w, v), q -&gt; List(s, r, q), b -&gt; List(d, c, b), g -&gt; List(i, h, g), l -&gt; List(n, m, l), p -&gt; List(r, q, p), c -&gt; List(e, d, c), h -&gt; List(j, i, h), r -&gt; List(t, s, r), w -&gt; List(y, x, w), k -&gt; List(m, l, k), o -&gt; List(q, p, o), d -&gt; List(f, e, d))

</pre></p>
<p>This is better than what we started with because we at least know what <em>letter2range</em> is, but it still isn&#8217;t clear what is going on after that. To make this more comprehensible, we can break it up over multiple lines and give more descriptive names to the variables. The following produces the same result as above.</p>
<p><pre class="brush: scala;">

letter2range.mapValues (
  range =&gt; {
    val alphavalues = range.map (number =&gt; letters(number-1))
    alphavalues.sorted.reverse
  }
)

</pre></p>
<p>Notice that:</p>
<ul>
<li>I called it <em>range</em> rather than <em>_</em> which is a better indicator of what <strong>mapValues</strong> is working with.</li>
<li>After the =&gt; I use an opening curly brace {</li>
<li>The next lines are a block of code that I can use like any block of code, which means I can create variables and break things down into smaller, more understandable steps. For example, the line creating <em>alphavalues</em> makes it clear that we are taking a range and mapping each number to the letter at the corresponding position in the <em>letters</em> list (e.g., the range 2, 3, 4 becomes &#8216;b&#8217;, &#8216;c&#8217;, &#8216;d&#8217;). For such a list, we then sort and reverse it (okay, so it started out sorted, but you can imagine plenty of times you need to do such sorting).</li>
<li>The value of the last line of that block is the result that <strong>mapValues</strong> produces for that element (here, indicated by the variable <em>range</em>).</li>
</ul>
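<p>The same idea works anywhere an expression is expected, because a block in curly braces is itself an expression whose value is that of its last line. Here&#8217;s a tiny sketch, unrelated to the letters example; the names <em>area</em>, <em>width</em>, <em>height</em> and <em>incrementedDoubles</em> are made up for it.</p>
<p><pre class="brush: scala;">
// The block's value is the value of its last expression.
val area = {
  val width = 3
  val height = 4
  width * height
}

// Scala also lets you pass such a block to map with braces instead of parentheses.
val incrementedDoubles = List(1, 2, 3).map { number =&gt;
  val doubled = number * 2
  doubled + 1
}

println(area)
println(incrementedDoubles)
</pre></p>
<p>Here <em>area</em> ends up as 12, and <em>incrementedDoubles</em> holds 3, 5 and 7. The intermediate names exist only inside their blocks.</p>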
<p>Basically, we get a lot more breathing room, and this becomes even more essential as you dig deeper or do more complex operations during a map-within-a-map operation. Having said that, you should ask yourself whether you should just create and use a function that has a clear semantics and does the job for you. For example, here&#8217;s an alternative to the above strategy that is perhaps clearer.</p>
<p><pre class="brush: scala;">

def lookupSortAndReverse (range: List[Int], alpha: List[String]) =
  range.map(number =&gt; alpha(number-1)).sorted.reverse

</pre></p>
<p>We&#8217;ve defined a function that takes a range and a list of letters (called <em>alpha</em> in the function) and produces the sorted and reversed list of letters corresponding to the numbers in the range. In other words, it does what the anonymous function defined after <em>range</em> in the previous code block did. We can thus easily use it in the top-level <strong>mapValues</strong> operation with completely clear intent and comprehensibility.</p>
<p><pre class="brush: scala;">
letter2range.mapValues(range =&gt; lookupSortAndReverse(range, letters))
</pre></p>
<p>Of course, you should especially consider creating such functions if you use the same operation in multiple places.</p>
<h2>Closures</h2>
<p>One final note. I passed the <em>letters</em> list into the <strong>lookupSortAndReverse</strong> function so that its value was bound to the function-internal variable <em>alpha</em>. You may wonder whether I needed to do that, or whether it is possible to directly access the <em>letters</em> list in the function. In fact you can: <span style="text-decoration:underline;">provided that <em>letters</em> has already been defined</span>, we can do the following.</p>
<p><pre class="brush: scala;">
def lookupSortAndReverseCapture (range: List[Int]) =
  range.map(number =&gt; letters(number-1)).sorted.reverse

letter2range.mapValues(range =&gt; lookupSortAndReverseCapture(range))
</pre></p>
<p>This is called a <strong>closure</strong>, meaning that the function has incorporated free variables (here, <em>letters</em>) that come from <em>outside</em> its own scope. I generally don&#8217;t use this strategy with named functions like this, but there are many natural situations for using closures. In fact you do it all the time when you are creating anonymous functions as arguments to functions like <strong>map</strong> and <strong>mapValues</strong> and their cousins. As a reminder, here was the map-within-a-mapValues anonymous function we defined before.</p>
<p><pre class="brush: scala;">
letter2range.mapValues (
  range =&gt; {
    val alphavalues = range.map (number =&gt; letters(number-1))
    alphavalues.sorted.reverse
  }
)

</pre></p>
<p>The <em>letters</em> variable has been &#8220;closed over&#8221; in the anonymous function <em>range =&gt; { &#8230; }</em>, which is not very different from what we did with the closure-style <strong>lookupSortAndReverseCapture</strong> function.</p>
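<p>Here&#8217;s one more minimal sketch of a closure, with names (<em>offset</em>, <em>addOffset</em>, <em>counter</em>, <em>increment</em>) that I&#8217;m introducing only for illustration. The second half shows that a Scala closure refers to the captured variable itself rather than to a copy of its value, so a change made through the function is visible outside it.</p>
<p><pre class="brush: scala;">
// offset is a free variable captured by the anonymous function.
val offset = 10
val addOffset = (number: Int) =&gt; number + offset
println(List(1, 2, 3).map(addOffset))

// Closing over a var: each call updates the variable defined outside.
var counter = 0
val increment = () =&gt; { counter += 1; counter }
increment()
increment()
println(counter)
</pre></p>
<p>The mapped list holds 11, 12 and 13, and <em>counter</em> is 2 after the two calls.</p>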
<h2>All the code in one spot</h2>
<p>Since there are some dependencies between the different steps in this tutorial that could get things mixed up, here&#8217;s all the code in one spot such that you can run it easily.</p>
<p><pre class="brush: scala;">

// Get a list of the letters
val letters = &quot;abcdefghijklmnopqrstuvwxyz&quot;.map(_.toString).toList

// Now create a map from each letter to a list containing itself
// and the two letters after it, in reverse alphabetical
// order. (Bizarre, but hey, it's a simple example. BTW, we lose y and
// z in the process.)

letters.zip((1 to 26).toList.sliding(3).toList).toMap.mapValues(_.map(x =&gt; letters(x-1)).sorted.reverse)

// Pretty unintelligible. Let's break things up a bit

val ranges = (1 to 26).toList.sliding(3).toList
val letter2range = letters.zip(ranges).toMap
letter2range.mapValues(_.map(x =&gt; letters(x-1)).sorted.reverse)

// Okay, that's better. But it is easier to interpret the latter if we break things up a bit

letter2range.mapValues (
  range =&gt; {
    val alphavalues = range.map (number =&gt; letters(number-1))
    alphavalues.sorted.reverse
  }
)

// We can also do the one-liner coherently if we have a helper function.

def lookupSortAndReverse (range: List[Int], alpha: List[String]) =
  range.map(number =&gt; alpha(number-1)).sorted.reverse

letter2range.mapValues(range =&gt; lookupSortAndReverse(range, letters))

// Note that we can &quot;capture&quot; the letters value, though this
// requires letters to be defined before lookupSortAndReverseCapture
// in the program.

def lookupSortAndReverseCapture (range: List[Int]) =
  range.map(number =&gt; letters(number-1)).sorted.reverse

letter2range.mapValues(range =&gt; lookupSortAndReverseCapture(range))

</pre></p>
<h2>Wrapup</h2>
<p>Hopefully this will encourage you to use a clearer coding style, and it demonstrates some aspects of code blocks that you may not have been aware of. However, this just scratches the surface of writing clearer code, and a lot of it will just come with time and practice and realizing how necessary it is when you look back at code you wrote months ago.</p>
<p>Note that one easy thing you can do to create better code is to try to stick to established coding conventions. For example, see <a href="http://docs.scala-lang.org/style/">the coding guidelines for Scala</a> on the <a href="http://docs.scala-lang.org/">Scala documentation project</a>. There is also a lot of other very useful stuff there, including tutorials, and it is actively evolving and growing!</p>
<p><span style="color:#888888;">Copyright 2011 Jason Baldridge</span></p>
<p><span style="color:#888888;">The text of this tutorial is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License. Attribution may be provided by linking to www.jasonbaldridge.com and to this original tutorial.</span></p>
<p><span style="color:#888888;">Suggestions, improvements, extensions and bug fixes welcome — please email Jason at jasonbaldridge@gmail.com or provide a comment to this post.</span></p>
]]></content:encoded>
			<wfw:commentRss>https://bcomposes.wordpress.com/2011/11/14/first-steps-in-scala-for-beginning-programmers-part-12/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="https://secure.gravatar.com/avatar/1a5262c6575daf92f62f5cc6d9f1f00f?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jasonbaldridge</media:title>
		</media:content>
	</item>
		<item>
		<title>First steps in Scala for beginning programmers, Part 11</title>
		<link>https://bcomposes.wordpress.com/2011/10/26/first-steps-in-scala-for-beginning-programmers-part-11/</link>
		<comments>https://bcomposes.wordpress.com/2011/10/26/first-steps-in-scala-for-beginning-programmers-part-11/#comments</comments>
		<pubDate>Wed, 26 Oct 2011 18:55:55 +0000</pubDate>
		<dc:creator>jasonbaldridge</dc:creator>
				<category><![CDATA[programming]]></category>
		<category><![CDATA[scala]]></category>
		<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://bcomposes.wordpress.com/?p=183</guid>
		<description><![CDATA[Topics: SBT, scalabha, packages, build systems Preface This is part 11 of tutorials for first-time programmers getting into Scala. Other posts are on this blog, and you can get links to those and other resources on the links page of the Computational Linguistics course I’m creating these for. This tutorial gives an introduction to building Scala applications &#8230;<p><a href="https://bcomposes.wordpress.com/2011/10/26/first-steps-in-scala-for-beginning-programmers-part-11/" class="more-link">Read More</a></p><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=bcomposes.wordpress.com&amp;blog=24939937&amp;post=183&amp;subd=bcomposes&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><strong>Topics</strong>: <em>SBT, scalabha, packages, build systems</em></p>
<h2>Preface</h2>
<p>This is part 11 of tutorials for first-time programmers getting into Scala. Other posts are on this blog, and you can get links to those and other resources on <a href="http://icl-f11.utcompling.com/links">the links page of the Computational Linguistics course</a> I’m creating these for.</p>
<p>This tutorial gives an introduction to building Scala applications using <a href="https://github.com/harrah/xsbt/wiki">SBT</a> (the Simple Build Tool). This will be done in the context of the <a href="https://bitbucket.org/jasonbaldridge/scalabha/overview">Scalabha</a> package, which I created primarily for <a href="http://icl-f11.utcompling.com/">my Introduction to Computational Linguistics class</a>. Scalabha includes some supporting code for basic natural language processing tasks; most relevant at the moment is the code that supports <a href="http://icl-f11.utcompling.com/assignments/hw4-part-of-speech-tagging">the part-of-speech tagging homework</a> for the class.</p>
<p><a title="First steps in Scala for beginning programmers, Part 10" href="../2011/10/25/first-steps-in-scala-for-beginning-programmers-part-10/">The previous tutorial</a> showed how Scala code can be compiled with <em>scalac</em> and then run with <em>scala</em>. One problem we ended up with is that there were generated class files littering the working directory. Another thing we did not discuss is how a large system can be created in a modular way that organizes code and classes. For example, you might want to have code in different directories generate classes that can be used by one another. You also may want to incorporate classes from other libraries into your own code. The solutions we’ll discuss to address these needs and more are build systems and packages.</p>
<p><em>Note</em>: The tutorial assumes you are using some version of Unix. If you are on Windows, you should consider using Cygwin, or you could <a href="https://help.ubuntu.com/community/WindowsDualBoot">dual boot your computer</a>.</p>
<p><em>Note</em>: In this tutorial, I’ll assume you are using a simple text editor to modify files. However, the general setup you are working with here can also be used from more powerful Integrated Development Environments (IDEs) like Eclipse, IntelliJ, and NetBeans.</p>
<h2>Setting up Scalabha</h2>
<p>We&#8217;ll work with SBT, which is perhaps the most popular build tool for Scala.  The Scalabha toolkit mentioned earlier uses SBT (version 0.11.0), so we&#8217;ll discuss SBT in the Scalabha context.</p>
<p>The first thing you need to do is download <a href="https://bitbucket.org/jasonbaldridge/scalabha/downloads/scalabha-0.1.1-src.zip">Scalabha v0.1.1</a>. Next, unzip the file, change to the directory it unpacked to, and list the directory contents.</p>
<p><pre class="brush: bash;">
$ unzip scalabha-0.1.1-src.zip
Archive:  scalabha-0.1.1-src.zip
&lt;lots of output&gt;
$ cd scalabha-0.1.1
$ ls
CHANGES.txt README      build.sbt   project
LICENSE     bin         data        src
</pre></p>
<p>Briefly, these contents are:</p>
<ul>
<li><strong>README</strong>: A text file describing how to install Scalabha on your machine.</li>
<li><strong>LICENSE</strong>: A text file giving the license, which is the Apache Software License 2.0.</li>
<li><strong>CHANGES.txt</strong>: A text file describing the modifications made for each version (not much so far).</li>
<li><strong>build.sbt</strong>: A text file that contains instructions for SBT regarding how to build Scalabha.</li>
<li><strong>bin</strong>: A directory that contains the scalabha script, which will be used to run applications developed within the Scalabha build system and also to run SBT itself. It also contains sbt-launch-0.11.0.jar, which is a bottled up package of SBT’s classes that will allow us to use SBT very easily. There are some other files that are Perl scripts that are relevant for a research project and aren’t important here.</li>
<li><strong>data</strong>: A directory containing part-of-speech tagged data for English and Czech that forms the basis for the fourth homework of my Introduction to Computational Linguistics course this semester.</li>
<li><strong>project</strong>: A directory containing a single file “plugins.sbt” which tells SBT to use the Assembly plugin. More on this later.</li>
<li><strong>src</strong>: The most important directory of all — it contains the source code of the Scalabha system, and is where you’ll be adding some code as you work with SBT.</li>
</ul>
<p>At this point you should read the <strong>README</strong> and get Scalabha set up on your computer, including building the system from source. In this tutorial, I will give some extra details on using SBT and code development with it, complementing and extending the brief information given in the README.</p>
<p>Note that I will refer to the environment variable <strong>SCALABHA_DIR</strong> below. As specified in the README, you should set this variable’s value to be where you unpacked Scalabha. For example, for me this directory is <strong>~/devel/scalabha</strong>.</p>
<p><em>Tip</em>: to make it so that you don’t have to set your environment variables every time you open a new shell, you can set environment variables in your <strong>~/.profile</strong> (Mac, Cygwin) or <strong>~/.bash_aliases</strong> (Ubuntu) files. For example, this is what I have in the profile files on my machines.</p>
<p><pre class="brush: bash;">
export SCALABHA_DIR=$HOME/devel/scalabha
export PATH=$PATH:$SCALABHA_DIR/bin
</pre></p>
<h2>SBT: The Simple Build Tool</h2>
<p>This is not a tutorial about setting up a project to use SBT; it is simply about how to use a project that is already set up for SBT. If you go looking for resources on learning SBT, what you’ll mainly find is material that helps programmers configure SBT for their own projects. That material will likely confuse you (the Simple Build Tool is not so simple any more when it comes to configuration). Using SBT is straightforward, but the everyday know-how that experienced coders have with such a tool is something you probably won’t find much help on. Here, I intend to give the basics so that you have a better starting point for doing more with SBT.</p>
<p>First off, there is a bit of sleight of hand with Scalabha that could be confusing. Rather than having users install SBT themselves, I have put the jar file for SBT in the <em>bin</em> directory of Scalabha; then, the <em>scalabha</em> executable (in that same directory) can pick that up and use it to run SBT. (My students and I have set up a number of Scala/Java projects in this way, including <a href="https://bitbucket.org/jasonbaldridge/fogbow">Fogbow</a>, <a href="http://code.google.com/p/junto/">Junto</a>, <a href="https://bitbucket.org/utcompling/textgrounder">Textgrounder</a>, and <a href="https://bitbucket.org/speriosu/updown/wiki/Home">Updown</a>.) The <em>scalabha</em> executable has a number of execution targets (more on this later), and one of these is “<em>build</em>”. When you call scalabha’s <em>build</em> target, it invokes SBT and drops you into the SBT interface.</p>
<p>Do the following, in your <em>SCALABHA_DIR</em>.</p>
<p><pre class="brush: bash;">
$ scalabha build
[info] Loading project definition from /Users/jbaldrid/devel/scalabha/project
[info] Set current project to Scalabha (in build file:/Users/jbaldrid/devel/scalabha/)
&gt;
</pre></p>
<p>You could have achieved the same by downloading SBT and running it according to the instructions for SBT, but this setup saves you that trouble and ensures that you get the right version of SBT. It is just worth pointing out so that you don’t think that Scalabha is SBT –  SBT is entirely independent of Scalabha.</p>
<p>If you have had any trouble with the Scalabha setup, you can create <a href="https://bitbucket.org/jasonbaldridge/scalabha/issues">an issue on the Scalabha Bitbucket site</a>. That just means that I’ll get a notice that you had some problems and can hopefully help you out. And, it is possible that someone else will have had the same problem, in which case you might find your answer there. Most of the problems with this sort of setup are due to confusions about environment variables and unfamiliarity with command line tools.</p>
<h2>Compiling with SBT</h2>
<p>Let’s actually do something with SBT now. If you successfully got through the README, you will have already done what is next, but I’ll give some more details about what is going on.</p>
<p>Because you may have run some SBT actions already as part of doing the README, start out by running the “<em>clean</em>” action so that we’re on the same page.</p>
<p><pre class="brush: bash;">
&gt; clean
[success] Total time: 0 s, completed Oct 26, 2011 10:18:08 AM
</pre></p>
<p>Then, run the &#8220;<em>compile</em>&#8221; action.</p>
<p><pre class="brush: bash;">
&gt; compile
[info] Updating {file:/Users/jbaldrid/devel/scalabha/}default-86efd0...
[info] Done updating.
[info] Compiling 13 Scala sources to /Users/jbaldrid/devel/scalabha/target/classes...
[success] Total time: 9 s, completed Oct 26, 2011 10:18:19 AM
</pre></p>
<p>In another shell (which means another command line window), go to <em>SCALABHA_DIR</em> and list the contents of the directory. You’ll see that two new directories have been created, <em>lib_managed</em> and <em>target</em>. The first is where other libraries have been downloaded from the internet and placed into the Scalabha project space so that they can be easily used; don’t worry about this for the time being. The second is where the compiled class files have gone. To see some example class files, do the following.</p>
<p><pre class="brush: bash;">
$ ls target/classes/opennlp/scalabha/postag/
BaselineTagger$$anonfun$tag$1.class
BaselineTagger.class
EnglishTagInfo$$anonfun$zipWithTag$1$1.class
&lt;... many more class files ...&gt;
RuleBasedTagger$$anonfun$tag$2.class
RuleBasedTagger$$anonfun$tagWord$1.class
RuleBasedTagger.class
</pre></p>
<p>These were generated from the following source files.</p>
<p><pre class="brush: bash;">
$ ls src/main/scala/opennlp/scalabha/postag/
HmmTagger.scala PosTagger.scala
</pre></p>
<p>Open up <strong>PosTagger.scala</strong> in a text editor and look at it: you’ll see the class and object definitions that were the sources for the generated class files in the <em>target/classes</em> directory. Basically, SBT has conveniently kept the source files and the compiled class files separate, so that we don’t have class files littering our work space.</p>
<p>How does SBT know where the source files are? Simple: it is configured to look at <em>src/main/scala</em> and compile every <em>.scala</em> file it finds under that directory. In just a bit, you’ll start adding your own Scala files and be able to compile and run them as part of the Scalabha build system.</p>
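<p>To make the idea of “every <em>.scala</em> file under a directory” concrete, here is a rough sketch in Scala that walks a directory tree and collects such files. It is only an illustration of the idea, not how SBT itself is implemented, and the names <strong>FindSources</strong> and <strong>scalaFiles</strong> are made up for this example.</p>
<p><pre class="brush: scala;">
import java.io.File

object FindSources {

  // Recursively collect every file under dir whose name ends in &quot;.scala&quot;.
  def scalaFiles (dir: File): List[File] = {
    // listFiles returns null when dir is not a readable directory, hence the Option.
    val entries = Option(dir.listFiles).map(_.toList).getOrElse(Nil)
    entries.flatMap { f =&gt;
      if (f.isDirectory) scalaFiles(f)
      else if (f.getName.endsWith(&quot;.scala&quot;)) List(f)
      else Nil
    }
  }

  def main (args: Array[String]): Unit = {
    // Assumes it is run from a directory that contains src/main/scala.
    scalaFiles(new File(&quot;src/main/scala&quot;)).foreach(println)
  }

}
</pre></p>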
<p>Next, at the SBT prompt, invoke the “<em>package</em>” action.</p>
<p><pre class="brush: bash;">
&gt; package
[info] Updating {file:/Users/jbaldrid/devel/scalabha/}default-86efd0...
[info] Done updating.
[info] Packaging /Users/jbaldrid/devel/scalabha/target/scalabha-0.1.1.jar ...
[info] Done packaging.
[success] Total time: 0 s, completed Oct 26, 2011 10:19:02 AM
</pre></p>
<p>In the shell prompt that we used to list files previously, list the contents of the target directory.</p>
<p><pre class="brush: bash;">
$ ls target/
cache              classes            scalabha-0.1.1.jar streams
</pre></p>
<p>You have just created <em>scalabha-0.1.1.jar</em>, a bottled up version of the Scalabha code that others could use in their own libraries. The extension “jar” stands for <strong>J</strong>ava <strong>Ar</strong>chive, and it is basically just a zipped up collection of a bunch of class files.</p>
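<p>If you want to see for yourself that a jar is just an archive of files, the following sketch lists the names of the entries in a jar using the standard Java class <strong>java.util.jar.JarFile</strong>. The object name <strong>ListJar</strong> and the helper <strong>entryNames</strong> are invented for this example, and it assumes that the path to a jar file is given as the first command line argument.</p>
<p><pre class="brush: scala;">
import java.util.jar.JarFile

object ListJar {

  // Return the names of all entries stored in the jar file at the given path.
  def entryNames (path: String): List[String] = {
    val jar = new JarFile(path)
    try {
      val entries = jar.entries()
      var names = List[String]()
      while (entries.hasMoreElements)
        names = entries.nextElement().getName :: names
      names.reverse
    } finally {
      jar.close()
    }
  }

  def main (args: Array[String]): Unit = {
    // Assumption: args(0) is the path to a jar file.
    entryNames(args(0)).foreach(println)
  }

}
</pre></p>
<p>Given the path to a jar such as <em>target/scalabha-0.1.1.jar</em>, it should print entry names that correspond to the class files you listed earlier.</p>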
<p>Scalabha itself uses a number of supporting libraries produced by others. To see the jars that are used as supporting libraries by Scalabha, do the following.</p>
<p><pre class="brush: bash;">
$ ls lib_managed/jars/*/*/*.jar
lib_managed/jars/jline/jline/jline-0.9.94.jar
lib_managed/jars/junit/junit/junit-3.8.1.jar
lib_managed/jars/org.apache.commons/commons-lang3/commons-lang3-3.0.1.jar
lib_managed/jars/org.clapper/argot_2.9.1/argot_2.9.1-0.3.5.jar
lib_managed/jars/org.clapper/grizzled-scala_2.9.1/grizzled-scala_2.9.1-1.0.8.jar
lib_managed/jars/org.scalatest/scalatest_2.9.0/scalatest_2.9.0-1.6.1.jar
</pre></p>
<p>Of course, you may still be wondering what it means to &#8220;use a library&#8221; in your code. More on this after we talk about packages and actually start writing some code ourselves.</p>
<h2>Packages</h2>
<p>Projects with a lot of code are generally organized into a package that has a set of sub-packages for parts of the code base that work closely together. At the very high level, a package is simply a way to ensure that we have unique fully qualified names for classes. For example, there is a class called <strong>Range</strong> in the Apache Commons Lang library and in the core Scala library. If you want to use both of these classes in the same piece of code, there is an obvious problem of a name conflict. Fortunately, they are contained within packages that allow us to refer to them uniquely.</p>
<ul>
<li><strong>Range</strong> in the Apache Commons Lang library is <strong>org.apache.commons.lang3.Range</strong></li>
<li><strong>Range</strong> in Scala is <strong>scala.collection.immutable.Range</strong></li>
</ul>
<p>So, when we do need to use them together, we are still able to do so without conflict. You’ve actually already seen some package names before, for example with <strong>java.lang.String</strong> and the distinction between <strong>scala.collection.mutable.Map</strong> and <strong>scala.collection.immutable.Map</strong>.</p>
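<p>As a small illustration that needs only the standard library, the sketch below refers to the two <strong>Map</strong> classes by their fully qualified names, so there is never any doubt about which one is meant. The object name <strong>MapNames</strong>, the helper <strong>buildMaps</strong>, and the values in the maps are made up for this example.</p>
<p><pre class="brush: scala;">
object MapNames {

  // Build one map of each kind, spelling out the full package path each time.
  def buildMaps = {
    val fixed = scala.collection.immutable.Map(&quot;coffee&quot; -&gt; 1)
    val growing = scala.collection.mutable.Map(&quot;coffee&quot; -&gt; 1)
    growing(&quot;tea&quot;) = 2  // only the mutable map can be updated in place
    (fixed, growing)
  }

  def main (args: Array[String]): Unit = {
    val (fixed, growing) = buildMaps
    println(&quot;Size of the immutable map: &quot; + fixed.size)
    println(&quot;Size of the mutable map: &quot; + growing.size)
  }

}
</pre></p>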
<p>To see the packages and classes in Scalabha, run the “<em>doc</em>” action in SBT.</p>
<p><pre class="brush: bash;">
&gt; doc
[info] Generating API documentation for main sources...
model contains 35 documentable templates
[info] API documentation generation successful.
[success] Total time: 7 s, completed Oct 26, 2011 10:22:23 AM
</pre></p>
<p>Now, point your browser to the file <em>target/api/index.html</em>. Note: this means doing “open file” and then going to your <em>SCALABHA_DIR</em> and then to <em>target</em>, then to <em>api</em>, and then selecting <em>index.html</em>. You can then browse the packages and classes in Scalabha. For example, look at <strong>HmmTagger</strong>, which is in the package <strong>opennlp.scalabha.postag</strong>, and you’ll see some of the fields and functions that are made available by that class.</p>
<p>But, you may still be wondering: how do I use these packages and classes in my code anyway? We do so via <em>import</em> statements. We’ll explore this by creating our own source code and compiling it.</p>
<h2>Creating and compiling new code in SBT</h2>
<p>First, we’ll begin by just doing a simple hello world application that is done in the context of Scalabha and uses a package name. Get set up for this by doing the following set of commands.</p>
<p><pre class="brush: bash;">
$ cd $SCALABHA_DIR
$ cd src/main/scala/opennlp/
$ mkdir bcomposes
</pre></p>
<p>Next, using a text editor, create the file <strong>Hello.scala</strong> in the <em>src/main/scala/opennlp/bcomposes</em> directory with the following contents.</p>
<p><pre class="brush: scala;">
package opennlp.bcomposes

object Hello {
  def main (args: Array[String]) = println(&quot;Hello, world!&quot;)
}
</pre></p>
<p>This is just like the hello world object from the previous tutorial, but now it has the additional package specification that indicates that its fully qualified name is <strong>opennlp.bcomposes.Hello</strong>.</p>
<p>Because the source code for <em>Hello.scala</em> is in a sub-directory of the <em>src/main/scala</em> directory, we can now compile this file using SBT. Make sure to save <em>Hello.scala</em>, and then go back to your SBT prompt and type “<em>compile</em>”.</p>
<p><pre class="brush: bash;">
&gt; compile
[info] Compiling 1 Scala source to /Users/jbaldrid/devel/scalabha/target/classes...
[success] Total time: 1 s, completed Oct 26, 2011 10:35:15 AM
</pre></p>
<p>Notice that it compiled just one Scala source: SBT has already compiled the other source files in Scalabha, so it only had to compile the new one that you just saved.</p>
<p>Having successfully created <strong>and</strong> compiled the <strong>opennlp.bcomposes.Hello</strong> object, we can now run it. The scalabha executable provides a “<em>run</em>” target that allows you to run any of the code you’ve produced in the Scalabha build setup. In your shell, type the following.</p>
<p><pre class="brush: bash;">
$ scalabha run opennlp.bcomposes.Hello
Hello, world!
</pre></p>
<p>There is actually a bunch of stuff going on under the hood that ensures that your new class is included in the <em>CLASSPATH</em> and can be used in this manner (see <em>bin/scalabha</em> for details). This will simplify things for you considerably. To make a long story short, getting the <em>CLASSPATH</em> appropriately set is one of the main points of confusion for new developers; this way you can keep on moving without having to worry about what is essentially a plumbing problem.</p>
<p>Now, let’s say you want to change the definition of the <strong>Hello</strong> object to also print out an additional message that is supplied on the command line. Modify the <strong>main</strong> method to look like this.</p>
<p><pre class="brush: scala;">
def main (args: Array[String]): Unit = {
  println(&quot;Hello, world!&quot;)
  println(args(0))
}
</pre></p>
<p>Now save it, and try running it.</p>
<p><pre class="brush: bash;">
$ scalabha run opennlp.bcomposes.Hello Goodbye
Hello, world!
</pre></p>
<p>Oops, it didn’t work?! I’ve just forced you directly into a common point of confusion for students who are switching from scripting to compiling: you must recompile the code before your changes take effect. So, invoke <em>compile</em> in SBT, and then try that command again.</p>
<p><pre class="brush: bash;">
$ scalabha run opennlp.bcomposes.Hello Goodbye
Hello, world!
Goodbye
</pre></p>
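<p>One thing to be aware of with this version of <strong>main</strong>: if you run it without an extra argument, <strong>args(0)</strong> fails with an <strong>ArrayIndexOutOfBoundsException</strong>, because the array is empty. A minimal sketch of one way to guard against that is given below; the object name <strong>HelloSafe</strong>, the helper <strong>message</strong>, and the default text are invented for this example.</p>
<p><pre class="brush: scala;">
object HelloSafe {

  // Pick the extra message: the first argument if there is one, otherwise a default.
  def message (args: Array[String]): String =
    if (args.length &gt; 0) args(0) else &quot;(no message given)&quot;

  def main (args: Array[String]): Unit = {
    println(&quot;Hello, world!&quot;)
    println(message(args))
  }

}
</pre></p>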
<p>To see what happens when you produce a syntax error in your Scala code, go back to <em>Hello.scala</em> and change the first print statement in the <strong>main</strong> method so that it is missing the last quote:</p>
<p><pre class="brush: scala;">
println(&quot;Hello, world!)
</pre></p>
<p>Now go back to SBT and compile again to see the love letter you get from the Scala compiler.</p>
<p><pre class="brush: bash;">
[info] Compiling 1 Scala source to /Users/jbaldrid/devel/scalabha/target/classes...
[error] /Users/jbaldrid/devel/scalabha/src/main/scala/opennlp/bcomposes/Hello.scala:5: unclosed string literal
[error]     println(&quot;Hello, world!)
[error]             ^
[error] /Users/jbaldrid/devel/scalabha/src/main/scala/opennlp/bcomposes/Hello.scala:7: ')' expected but '}' found.
[error]   }
[error]   ^
[error] two errors found
[error] {file:/Users/jbaldrid/devel/scalabha/}default-86efd0/compile:compile: Compilation failed
[error] Total time: 0 s, completed Oct 26, 2011 11:02:07 AM
</pre></p>
<p>The compile attempt failed, and you must go back and fix it. But don’t do that yet. There’s a handy aspect of SBT in this <em>write-save-compile</em> loop that saves you time and effort: SBT allows <em>triggered</em> execution of actions, which means that SBT can automatically perform an action if there is a change to the stuff it cares about. The <em>compile</em> action cares about the source code, so it can monitor changes in the file system and automatically recompile any time a file is saved. To do this, you simply add <strong>~</strong> in front of the action.</p>
<p>Before fixing the error, type <strong>~compile</strong> into SBT. You’ll see the same error message as before, but don’t worry about that. The last line of output from SBT will say:</p>
<p><pre class="brush: bash;">
1. Waiting for source changes... (press enter to interrupt)
</pre></p>
<p>Now go to <em>Hello.scala</em> again, add the quote back in, and save the file. This triggers the compile action in SBT, so you’ll see it automatically compile, with a success message.</p>
<p><pre class="brush: bash;">
[info] Compiling 1 Scala source to /Users/jbaldrid/devel/scalabha/target/classes...
[success] Total time: 0 s, completed Oct 26, 2011 11:02:49 AM
2. Waiting for source changes... (press enter to interrupt)
</pre></p>
<p>This is a nice way to see if your code is compiling as you work on it, with very little effort. Every time you save the file, it will let you know if there are problems. And, you’ll also be able to use the <em>scalabha</em> <em>run</em> target and know that you are using the latest compiled version when you do so.</p>
<p>As you develop your code in this way, you can invoke the “<em>doc</em>” action in SBT, then reload the <em>index.html</em> page in your browser, and it will show you the updated documentation for the things you’ve created. Try it now and look at the <strong>opennlp.bcomposes</strong> package that you’ve now created.</p>
<h2>Creating code that uses existing packages</h2>
<p>Now we can come back to using code from existing packages. In the past (if you’ve gone through all of these tutorials), you’ve seen statements like <strong>import scala.io.Source</strong>. That came from the standard Scala library, so it is always available to any Scala program. However, you can also use classes developed by others in a similar manner, provided your <em>CLASSPATH</em> is set up such that they are available. That is exactly what SBT does for you: all of the classes that are defined in the <em>src/main/scala</em> sub-directories are ready for your use.</p>
<p>As an example, save the following code as <em>src/main/scala/opennlp/bcomposes/TreeTest.scala</em>. It constructs a standard phrase structure tree for the sentence “I like coffee.”</p>
<p><pre class="brush: scala;">
package opennlp.bcomposes

import opennlp.scalabha.model.{Node,Value}

object TreeTest {

  def main (args: Array[String]): Unit = {
    val leaf1 = Value(&quot;I&quot;)
    val leaf2 = Value(&quot;like&quot;)
    val leaf3 = Value(&quot;coffee&quot;)
    val subjNpNode = Node(&quot;NP&quot;, List(leaf1))
    val verbNode = Node(&quot;V&quot;, List(leaf2))
    val objNpNode = Node(&quot;NP&quot;, List(leaf3))
    val vpNode = Node(&quot;VP&quot;, List(verbNode, objNpNode))
    val sentenceNode = Node(&quot;S&quot;, List(subjNpNode, vpNode))

    println(&quot;Printing the full tree:\n&quot; + sentenceNode)
    println(&quot;\nPrinting the children of the VP node:\n&quot; + vpNode.children)

    println(&quot;\nPrinting the yield of the full tree:\n&quot; + sentenceNode.getTokens.mkString(&quot; &quot;))
    println(&quot;\nPrinting the yield of the VP node:\n&quot; + vpNode.getTokens.mkString(&quot; &quot;))
  }

}
</pre></p>
<p>There are a few things to note here. The <em>import</em> statement at the top is what tells Scala the fully qualified package names for the classes <strong>Node</strong> and <strong>Value</strong>. You could have equivalently written it less concisely as follows.</p>
<p><pre class="brush: scala;">
import opennlp.scalabha.model.Node
import opennlp.scalabha.model.Value
</pre></p>
<p>Or, you could have left out the <em>import</em> statement and written the fully qualified names everywhere, e.g.:</p>
<p><pre class="brush: scala;">
val leaf1 = opennlp.scalabha.model.Value(&quot;I&quot;)
</pre></p>
<p>Second, <strong>Node</strong> and <strong>Value</strong> are <em>case</em> classes. We’ll discuss this more later, but for now, all you need to know is that to create an object of the <strong>Node</strong> or <strong>Value</strong> classes, it isn’t necessary to use the “<em>new</em>” keyword.</p>
<p>Third, the <em>print</em> statements are using the Scalabha API (Application Programming Interface) to do useful things with the objects, such as printing out the tree they describe, printing the yield of the nodes (the words that they cover), and so on. The scaladoc you looked at before for Scalabha shows you these functions, so go have a look if you haven’t already.</p>
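<p>If you are curious about how case classes make this kind of code possible, here is a small self-contained sketch. The classes <strong>Leaf</strong> and <strong>Branch</strong> and the function <strong>tokens</strong> are hypothetical stand-ins written for this example; they are not Scalabha’s <strong>Node</strong> and <strong>Value</strong>, whose actual definitions you should look up in the scaladoc.</p>
<p><pre class="brush: scala;">
// Hypothetical stand-ins for illustration only; NOT Scalabha's Node and Value.
sealed trait Tree
case class Leaf (word: String) extends Tree
case class Branch (label: String, children: List[Tree]) extends Tree

object CaseClassDemo {

  // The yield of a tree: the words at its leaves, from left to right.
  def tokens (tree: Tree): List[String] = tree match {
    case Leaf(word) =&gt; List(word)
    case Branch(_, children) =&gt; children.flatMap(tokens)
  }

  def main (args: Array[String]): Unit = {
    // No &quot;new&quot; is needed: each case class comes with a factory method.
    val verb = Branch(&quot;V&quot;, List(Leaf(&quot;like&quot;)))
    val obj = Branch(&quot;NP&quot;, List(Leaf(&quot;coffee&quot;)))
    val vp = Branch(&quot;VP&quot;, List(verb, obj))
    println(tokens(vp).mkString(&quot; &quot;))  // prints: like coffee
  }

}
</pre></p>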
<p>Note that if you left triggered compilation on, SBT will have automatically compiled <em>TreeTest.scala</em>. Otherwise, make sure to compile it yourself. Then, run it.</p>
<p><pre class="brush: bash;">
$ scalabha run opennlp.bcomposes.TreeTest
Printing the full tree:
Node(S,List(Node(NP,List(Value(I))), Node(VP,List(Node(V,List(Value(like))), Node(NP,List(Value(coffee)))))))

Printing the children of the VP node:
List(Node(V,List(Value(like))), Node(NP,List(Value(coffee))))

Printing the yield of the full tree:
I like coffee

Printing the yield of the VP node:
like coffee
</pre></p>
<h2>Make and use your own package</h2>
<p>Importing classes in this manner lets you build on existing code rather than writing everything yourself. Any class in Scalabha or in the libraries that are included with it is available to you, including any classes you define. As an example, do the following.</p>
<p><pre class="brush: bash;">
$ cd $SCALABHA_DIR/src/main/scala/opennlp/bcomposes
$ mkdir person
$ mkdir music
</pre></p>
<p>Now save the <strong>Person</strong> class from the previous tutorial as <em>Person.scala</em> in the <em>person</em> directory. Here’s the code again (note the addition of the <em>package</em> statement).</p>
<p><pre class="brush: scala;">
package opennlp.bcomposes.person

class Person (
  val firstName: String,
  val lastName: String,
  val age: Int,
  val occupation: String
) {

  def fullName: String = firstName + &quot; &quot; + lastName

  def greet (formal: Boolean): String = {
    if (formal)
      &quot;Hello, my name is &quot; + fullName + &quot;. I'm a &quot; + occupation + &quot;.&quot;
    else
      &quot;Hi, I'm &quot; + firstName + &quot;!&quot;
  }

}
</pre></p>
<p>Now save the following as <em>RadioheadGreeting.scala</em> in the <em>music</em> directory.</p>
<p><pre class="brush: scala;">
package opennlp.bcomposes.music

import opennlp.bcomposes.person.Person

object RadioheadGreeting {

  def main (args: Array[String]): Unit = {
    val thomYorke = new Person(&quot;Thom&quot;, &quot;Yorke&quot;, 43, &quot;musician&quot;)
    val johnnyGreenwood = new Person(&quot;Johnny&quot;, &quot;Greenwood&quot;, 39, &quot;musician&quot;)
    val colinGreenwood = new Person(&quot;Colin&quot;, &quot;Greenwood&quot;, 41, &quot;musician&quot;)
    val edObrien = new Person(&quot;Ed&quot;, &quot;O'Brien&quot;, 42, &quot;musician&quot;)
    val philSelway = new Person(&quot;Phil&quot;, &quot;Selway&quot;, 44, &quot;musician&quot;)
    val radiohead = List(thomYorke, johnnyGreenwood, colinGreenwood, edObrien, philSelway)
    radiohead.foreach(bandmember =&gt; println(bandmember.greet(false)))
  }

}
</pre></p>
<p>When we did <a title="First steps in Scala for beginning programmers, Part 10" href="../2011/10/25/first-steps-in-scala-for-beginning-programmers-part-10/">the compilation tutorial</a> previously, <em>Person.scala</em> and <em>RadioheadGreeting.scala</em> were in the same directory, which allowed the latter to know about the <strong>Person</strong> class. Now that they are in separate packages, the <strong>Person</strong> class must be explicitly imported; once you’ve done so, you can code with <strong>Person</strong> objects just as you did before.</p>
<p>Finally, to run it, we now must specify the fully qualified package name for <strong>RadioheadGreeting</strong>.</p>
<p><pre class="brush: bash;">
$ scalabha run opennlp.bcomposes.music.RadioheadGreeting
Hi, I'm Thom!
Hi, I'm Johnny!
Hi, I'm Colin!
Hi, I'm Ed!
Hi, I'm Phil!
</pre></p>
<h2>A note on package names and their relation to directories</h2>
<p>Package names are made unique by certain conventions that generally ensure you won’t get clashes. For example, we are using <strong>opennlp.scalabha</strong> and <strong>opennlp.bcomposes</strong>, which I happen to know are unique. Quite often these names will include full internet domains, in reverse, like <strong>org.apache.commons</strong> and <strong>com.cloudera.crunch</strong>. By convention, we put the source files that are in packages (and subpackages) in directory structures that reflect the names. So, for example, <strong>opennlp.bcomposes.music.RadioheadGreeting</strong> is in the directory <em>src/main/scala/opennlp/bcomposes/music</em>. However, it is worth noting that this is not a hard constraint with Scala (as it is with Java).</p>
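<p>To illustrate that last point, here is a minimal sketch in which a single source file declares two packages, using the form of the package clause that takes braces. The package names <strong>demo.person</strong> and <strong>demo.music</strong> are invented for this example. Since the packages are declared in the file itself, it should compile with <em>scalac</em> no matter which directory the file is saved in.</p>
<p><pre class="brush: scala;">
// Two packages declared in one file; the names are made up for illustration.
package demo.person {

  class Person (val firstName: String) {
    def greet: String = &quot;Hi, I'm &quot; + firstName + &quot;!&quot;
  }

}

package demo.music {

  // Person lives in a different package, so it must be imported here.
  import demo.person.Person

  object Greeting {
    def main (args: Array[String]): Unit = {
      println(new Person(&quot;Thom&quot;).greet)
    }
  }

}
</pre></p>
<p>Even though this works, following the directory convention is still a good idea: it makes it much easier to find the source file for a given class.</p>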
<p>There is a great deal more to using a build system, but this is where I must end this discussion, hoping it is enough to get the core concepts across and make it possible for my students to do <a href="http://icl-f11.utcompling.com/assignments/hw4-part-of-speech-tagging">the homework on part-of-speech tagging</a> and make use of the <strong>opennlp.scalabha.postag</strong> package!</p>
<p><span style="color:#888888;">Copyright 2011 Jason Baldridge</span></p>
<p><span style="color:#888888;">The text of this tutorial is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License. Attribution may be provided by linking to www.jasonbaldridge.com and to this original tutorial.</span></p>
<p><span style="color:#888888;">Suggestions, improvements, extensions and bug fixes welcome — please email Jason at jasonbaldridge@gmail.com or provide a comment to this post.</span></p>
]]></content:encoded>
			<wfw:commentRss>https://bcomposes.wordpress.com/2011/10/26/first-steps-in-scala-for-beginning-programmers-part-11/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="https://secure.gravatar.com/avatar/1a5262c6575daf92f62f5cc6d9f1f00f?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jasonbaldridge</media:title>
		</media:content>
	</item>
		<item>
		<title>First steps in Scala for beginning programmers, Part 10</title>
		<link>https://bcomposes.wordpress.com/2011/10/25/first-steps-in-scala-for-beginning-programmers-part-10/</link>
		<comments>https://bcomposes.wordpress.com/2011/10/25/first-steps-in-scala-for-beginning-programmers-part-10/#comments</comments>
		<pubDate>Wed, 26 Oct 2011 03:29:10 +0000</pubDate>
		<dc:creator>jasonbaldridge</dc:creator>
				<category><![CDATA[programming]]></category>
		<category><![CDATA[scala]]></category>
		<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://bcomposes.wordpress.com/?p=174</guid>
		<description><![CDATA[Topics: scripting, compiling, main methods, return values of functions Preface This is part 10 of tutorials for first-time programmers getting into Scala. Other posts are on this blog, and you can get links to those and other resources on the links page of the Computational Linguistics course I’m creating these for. The tutorials up to &#8230;<p><a href="https://bcomposes.wordpress.com/2011/10/25/first-steps-in-scala-for-beginning-programmers-part-10/" class="more-link">Read More</a></p><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=bcomposes.wordpress.com&amp;blog=24939937&amp;post=174&amp;subd=bcomposes&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><strong>Topics</strong>: <em>scripting, compiling, main methods, return values of functions</em></p>
<h2>Preface</h2>
<p>This is part 10 of tutorials for first-time programmers getting into Scala. Other posts are on this blog, and you can get links to those and other resources on <a href="http://icl-f11.utcompling.com/links">the links page of the Computational Linguistics course</a> I’m creating these for.<br />
The tutorials up to this point have been based on working with the Scala REPL or running basic scripts from the command line. The latter is called &#8220;scripting&#8221; and is usually done for fairly simple, self-contained coding tasks. For more involved tasks that require a number of different modules and access to libraries produced by others, it is necessary to work with a build system that brings together your code and others&#8217; code and allows you to compile, test, and package it so that you can use it as an application.</p>
<p>This tutorial takes you from running Scala scripts to compiling Scala programs to create byte code that can be shared by different applications. This will act as a bridge to set you up for the next step of using a build system. Along the way, some points will be made about objects, extending on some of the ideas from the previous tutorial about object-oriented programming. At a high level, the relevance of objects to a larger, modularized code base should be pretty clear: objects encapsulate data and functions that can be used by other objects, and we need to be able to organize them so that objects know how to find other objects and class definitions. Build systems, which we&#8217;ll look at in the next tutorial, will make this straightforward.</p>
<h2>Running Scala scripts</h2>
<p>In the beginning, you started with the REPL.</p>
<p><pre class="brush: scala;">
scala&gt; println(&quot;Hello, World!&quot;)
Hello, World!
</pre></p>
<p>Of course, the REPL is just a (very useful) playground for trying out snippets of Scala code, not for doing real work. So, you saw that you could put code like <strong>println(&#8220;Hello, World!&#8221;)</strong> into a file called <strong>Hello.scala</strong> and run it from the command line.</p>
<p><pre class="brush: bash;">
$ scala Hello.scala
Hello, World!
</pre></p>
<p>The homeworks and tutorials done so far have worked in this way, though they are a bit more complex. We can even include class definitions and objects created from a class. For example, using the <strong>Person</strong> class from <a title="First steps in Scala for beginning programmers, Part 9" href="http://bcomposes.wordpress.com/2011/10/24/first-steps-in-scala-for-beginning-programmers-part-9/">the previous tutorial</a>, we can put all the code into a file called <strong>People.scala</strong> (btw, this name doesn&#8217;t matter &#8212; could as well be <strong>Blurglecruncheon.scala</strong>).</p>
<p><pre class="brush: scala;">
class Person (
  val firstName: String,
  val lastName: String,
  val age: Int,
  val occupation: String
) {

  def fullName: String = firstName + &quot; &quot; + lastName

  def greet (formal: Boolean): String = {
    if (formal)
      &quot;Hello, my name is &quot; + fullName + &quot;. I'm a &quot; + occupation + &quot;.&quot;
    else
      &quot;Hi, I'm &quot; + firstName + &quot;!&quot;
  }

}

val johnSmith = new Person(&quot;John&quot;, &quot;Smith&quot;, 37, &quot;linguist&quot;)
val janeDoe = new Person(&quot;Jane&quot;, &quot;Doe&quot;, 34, &quot;computer scientist&quot;)
val johnDoe = new Person(&quot;John&quot;, &quot;Doe&quot;, 43, &quot;philosopher&quot;)
val johnBrown = new Person(&quot;John&quot;, &quot;Brown&quot;, 28, &quot;mathematician&quot;)

val people = List(johnSmith, janeDoe, johnDoe, johnBrown)
people.foreach(person =&gt; println(person.greet(true)))
</pre></p>
<p>This can now be run from the command line, producing the expected result.</p>
<p><pre class="brush: bash;">
$ scala People.scala
Hello, my name is John Smith. I'm a linguist.
Hello, my name is Jane Doe. I'm a computer scientist.
Hello, my name is John Doe. I'm a philosopher.
Hello, my name is John Brown. I'm a mathematician.
</pre></p>
<p>However, suppose you wanted to use the <strong>Person</strong> class from a different application (e.g. one that is defined in a different file). You might think you could save the following in the file <strong>Radiohead.scala</strong>, and then run it with Scala.</p>
<p><pre class="brush: scala;">
val thomYorke = new Person(&quot;Thom&quot;, &quot;Yorke&quot;, 43, &quot;musician&quot;)
val johnnyGreenwood = new Person(&quot;Johnny&quot;, &quot;Greenwood&quot;, 39, &quot;musician&quot;)
val colinGreenwood = new Person(&quot;Colin&quot;, &quot;Greenwood&quot;, 41, &quot;musician&quot;)
val edObrien = new Person(&quot;Ed&quot;, &quot;O'Brien&quot;, 42, &quot;musician&quot;)
val philSelway = new Person(&quot;Phil&quot;, &quot;Selway&quot;, 44, &quot;musician&quot;)
val radiohead = List(thomYorke, johnnyGreenwood, colinGreenwood, edObrien, philSelway)
radiohead.foreach(bandmember =&gt; println(bandmember.greet(false)))
</pre></p>
<p>However, if you do &#8220;<strong>scala Radiohead.scala</strong>&#8221; you&#8217;ll see five errors, each one complaining that the type <strong>Person</strong> wasn&#8217;t found. How could <strong>Radiohead.scala</strong> know about the Person class and where to find its definition? I&#8217;m not aware of a way to do this with scripting-style Scala programming, and even though I suspect there may be a way to do something this simple, I don&#8217;t even care to know it. Let&#8217;s just get straight to compiling.</p>
<h2>Compiling</h2>
<p>The usual thing we do with Scala is to compile our programs to byte code. We won&#8217;t go into the details of that, but it basically means that Scala turns the text of a Scala program into a compiled set of instructions that can be run by the Java Virtual Machine (JVM), rather than directly by your operating system. (Because Scala compiles to Java byte code, it is pretty straightforward to use Java code when coding in Scala.)</p>
<p>So, what does compilation look like? We need to start by changing the code we did above a bit. Make a directory that has nothing in it, say <strong>/tmp/tutorial</strong>. Then save the following as <strong>PersonApp.scala</strong> in that directory.</p>
<p><pre class="brush: scala;">
class Person (
  val firstName: String,
  val lastName: String,
  val age: Int,
  val occupation: String
) {

  def fullName: String = firstName + &quot; &quot; + lastName

  def greet (formal: Boolean): String = {
    if (formal)
      &quot;Hello, my name is &quot; + fullName + &quot;. I'm a &quot; + occupation + &quot;.&quot;
    else
      &quot;Hi, I'm &quot; + firstName + &quot;!&quot;
  }

}

object PersonApp {

  def main (args: Array[String]) {
    val johnSmith = new Person(&quot;John&quot;, &quot;Smith&quot;, 37, &quot;linguist&quot;)
    val janeDoe = new Person(&quot;Jane&quot;, &quot;Doe&quot;, 34, &quot;computer scientist&quot;)
    val johnDoe = new Person(&quot;John&quot;, &quot;Doe&quot;, 43, &quot;philosopher&quot;)
    val johnBrown = new Person(&quot;John&quot;, &quot;Brown&quot;, 28, &quot;mathematician&quot;)

    val people = List(johnSmith, janeDoe, johnDoe, johnBrown)
    people.foreach(person =&gt; println(person.greet(true)))
  }

}
</pre></p>
<p>Notice that the code looks pretty similar to the script above, but now we have a <strong>PersonApp</strong> object with a <strong>main</strong> method. The <strong>main</strong> method contains all the stuff that the original script had after the <strong>Person</strong> definition. Notice also that there is an <strong>args</strong> argument to the <strong>main</strong> method, which should look familiar now. What you are seeing is that a Scala script is basically just a simplified view of an object with a <strong>main</strong> method. Such scripts use the convention that the <strong>Array[String]</strong> provided to the method is called <strong>args</strong>.</p>
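<p>To make the correspondence concrete, here is a minimal sketch in the same style as the code above. The object name <strong>Echo</strong> is made up for illustration; the comment shows what the script version of the same program would look like.</p>
<p><pre class="brush: scala;">
// Hypothetical example. As a script, this file would be just one line:
//
//   println(args.mkString(&quot; &quot;))
//
// The same program, written as an object with a main method:
object Echo {

  // The command-line arguments arrive in the args array,
  // just as they do implicitly in a script.
  def main (args: Array[String]) {
    println(args.mkString(&quot; &quot;))
  }

}
</pre></p>
<p>In the script, the <strong>args</strong> array is simply available; in the object, it is the declared parameter of <strong>main</strong>.</p>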
<p>Okay, so now consider what happens if you run &#8220;<strong>scala PersonApp.scala</strong>&#8221; &#8212; nothing at all. That&#8217;s because there is no executable code available outside of the object and class definitions. Instead, we need to compile the code and then run the <strong>main</strong> method of specific objects. The next step is to run <strong>scalac</strong> (N.B. &#8220;scala<strong>c</strong>&#8221; with a &#8220;c&#8221;, not &#8220;scala&#8221;) on <strong>PersonApp.scala</strong>. The name scalac is short for <span style="text-decoration:underline;">Scala c</span>ompiler. Do the following steps in the <strong>/tmp/tutorial</strong> directory.</p>
<p><pre class="brush: bash;">
$ scalac PersonApp.scala
$ ls
Person.class                    PersonApp.class
PersonApp$$anonfun$main$1.class PersonApp.scala
PersonApp$.class
</pre></p>
<p>Notice that a number of <strong>*.class</strong> files have been generated. These are byte code files that the scala application is able to run. A nice thing here is that all the compilation is done: when in the past you ran &#8220;scala&#8221; on your programs (scripts), it had to first compile the instructions and then run the program. Now we are separating these steps into a compilation phase and a running phase.</p>
<p>Having generated the class files, we can run any object that has a main method, like <strong>PersonApp</strong>.</p>
<p><pre class="brush: bash;">
$ scala PersonApp
Hello, my name is John Smith. I'm a linguist.
Hello, my name is Jane Doe. I'm a computer scientist.
Hello, my name is John Doe. I'm a philosopher.
Hello, my name is John Brown. I'm a mathematician.
</pre></p>
<p>Try running &#8220;<strong>scala Person</strong>&#8221; to see the error message it gives you.</p>
<p>Next, move the <strong>Radiohead.scala</strong> script that you saved earlier into this directory and run it.</p>
<p><pre class="brush: bash;">
$ scala Radiohead.scala
Hi, I'm Thom!
Hi, I'm Johnny!
Hi, I'm Colin!
Hi, I'm Ed!
Hi, I'm Phil!
</pre></p>
<p>This is the same script, but now it is in a directory that contains the <strong>Person.class</strong> file, which tells Scala everything that <strong>Radiohead.scala</strong> needs to construct objects of the <strong>Person</strong> class. Scala makes available any class file that it can find on the <em>CLASSPATH</em>, a list of locations (settable through an environment variable of the same name) that by default includes the current working directory.</p>
<p>Despite this success, we&#8217;re going away from script land with this post, so change the contents of <strong>Radiohead.scala</strong> to be the following.</p>
<p><pre class="brush: scala;">
object RadioheadGreeting {

  def main (args: Array[String]) {
    val thomYorke = new Person(&quot;Thom&quot;, &quot;Yorke&quot;, 43, &quot;musician&quot;)
    val johnnyGreenwood = new Person(&quot;Johnny&quot;, &quot;Greenwood&quot;, 39, &quot;musician&quot;)
    val colinGreenwood = new Person(&quot;Colin&quot;, &quot;Greenwood&quot;, 41, &quot;musician&quot;)
    val edObrien = new Person(&quot;Ed&quot;, &quot;O'Brien&quot;, 42, &quot;musician&quot;)
    val philSelway = new Person(&quot;Phil&quot;, &quot;Selway&quot;, 44, &quot;musician&quot;)
    val radiohead = List(thomYorke, johnnyGreenwood, colinGreenwood, edObrien, philSelway)
    radiohead.foreach(bandmember =&gt; println(bandmember.greet(false)))
  }

}
</pre></p>
<p>Then run scalac on all of the <strong>*.scala</strong> files in the directory. There are now more class files, corresponding to the <strong>RadioheadGreeting</strong> object we defined.</p>
<p><pre class="brush: bash;">
$ scalac *.scala
$ ls
Person.class                            Radiohead.scala
PersonApp$$anonfun$main$1.class         RadioheadGreeting$$anonfun$main$1.class
PersonApp$.class                        RadioheadGreeting$.class
PersonApp.class                         RadioheadGreeting.class
PersonApp.scala
</pre></p>
<p>You can now run &#8220;<strong>scala RadioheadGreeting</strong>&#8221; to get the greeting from the band members. Notice that the file in which <strong>RadioheadGreeting</strong> was saved is called <strong>Radiohead.scala</strong>, and that no class files called <strong>Radiohead.class</strong>, etc. were generated. Again, the file could have been named something entirely different, like <strong>Turlingdrome.scala</strong>. (<a href="http://en.wikipedia.org/wiki/Vogon#Poetry">Embrace your inner Vogon.</a>)</p>
<h2>Multiple objects in the same file</h2>
<p>There is no problem having multiple objects with <strong>main</strong> methods in the same file. When you compile the file with <strong>scalac</strong>, each object generates its own set of class files, and you call <strong>scala</strong> with the name of whichever object contains the <strong>main</strong> method you want to run. As an example, save the following as <strong>Greetings.scala</strong>.</p>
<p><pre class="brush: scala;">
object Hello {
  def main (args: Array[String]) {
    println(&quot;Hello, world!&quot;)
  }
}

object Goodbye {
  def main (args: Array[String]) {
    println(&quot;Goodbye, world!&quot;)
  }
}

object SayIt {
  def main (args: Array[String]) {
    args.foreach(println)
  }
}
</pre></p>
<p>Next, compile the file; you can then run any of the objects it defines (since they all have <strong>main</strong> methods).</p>
<p><pre class="brush: bash;">
$ scalac Greetings.scala
$ scala Hello
Hello, world!
$ scala Goodbye
Goodbye, world!
$ scala Goodbye many useless arguments
Goodbye, world!
$ scala SayIt &quot;Oh freddled gruntbuggly&quot; &quot;thy micturations are to me&quot; &quot;As plurdled gabbleblotchits on a lurgid bee.&quot;
Oh freddled gruntbuggly
thy micturations are to me
As plurdled gabbleblotchits on a lurgid bee.
</pre></p>
<p>In case you missed it earlier, the <strong>args</strong> array is where the command line arguments go and you can thus make use of them (or not, as in the case of the <strong>Hello</strong> and <strong>Goodbye</strong> objects).</p>
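<p>As a small sketch of making use of the arguments, the following object (the name <strong>Greet</strong> and the default value are invented for this example) uses the first argument if one was given and falls back to a default otherwise.</p>
<p><pre class="brush: scala;">
// Hypothetical example: use the first command-line argument when one
// is given, and fall back to a default otherwise.
object Greet {

  def main (args: Array[String]) {
    // args(0) is only safe to access when the array is non-empty.
    val name = if (args.length &gt; 0) args(0) else &quot;world&quot;
    println(&quot;Hello, &quot; + name + &quot;!&quot;)
  }

}
</pre></p>
<p>After compiling, <strong>scala Greet</strong> should print <strong>Hello, world!</strong> and <strong>scala Greet Zaphod</strong> should print <strong>Hello, Zaphod!</strong>. Checking the length first avoids an error from asking for an argument that was never supplied.</p>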
<h2>Functions with return values versus those without</h2>
<p>Some functions return a value while others do not. As a simple example, consider the following pair of functions.</p>
<p><pre class="brush: scala;">
scala&gt; def plusOne (x: Int) = x+1
plusOne: (x: Int)Int

scala&gt; def printPlusOne (x: Int) = println(x+1)
printPlusOne: (x: Int)Unit
</pre></p>
<p>The first takes an <strong>Int</strong> argument and returns an <strong>Int</strong>, which is a value. The other takes an <strong>Int</strong> and returns <strong>Unit</strong>, which is to say it doesn&#8217;t return a value. Notice the difference in behavior between the two following uses of the functions.</p>
<p><pre class="brush: scala;">
scala&gt; val foo = plusOne(2)
foo: Int = 3

scala&gt; val bar = printPlusOne(2)
3
bar: Unit = ()
</pre></p>
<p>Scala uses a slightly subtle distinction in function definitions to separate functions that return values from those that return <strong>Unit</strong> (no value): if you don&#8217;t use an equals sign in the definition, it means that the function returns <strong>Unit</strong>.</p>
<p><pre class="brush: scala;">
scala&gt; def plusOneNoEquals (x: Int) { x+1 }
plusOneNoEquals: (x: Int)Unit

scala&gt; def printPlusOneNoEquals (x: Int) { println(x+1) }
printPlusOneNoEquals: (x: Int)Unit
</pre></p>
<p>Notice that the above definition of <strong>plusOneNoEquals</strong> returns <strong>Unit</strong>, even though it looks almost identical to <strong>plusOne</strong> defined earlier. Check it out.</p>
<p><pre class="brush: scala;">
scala&gt; val foo = plusOneNoEquals(2)
foo: Unit = ()
</pre></p>
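<p>One way to make the distinction harder to miss, sketched here with made-up function names, is to write the return type explicitly. With the type spelled out, the definition itself says whether a value comes back, rather than leaving that to the presence or absence of an equals sign.</p>
<p><pre class="brush: scala;">
// Hypothetical names; explicit return types make the intent visible.

// Returns an Int: the type annotation and the equals sign are both present.
def plusOneTyped (x: Int): Int = x + 1

// Returns Unit: same meaning as the no-equals form, but stated outright.
def printPlusOneTyped (x: Int): Unit = println(x + 1)

val three = plusOneTyped(2)   // three is an Int with value 3
printPlusOneTyped(2)          // prints 3 and returns ()
</pre></p>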
<p>Now look back at the <strong>main</strong> methods given earlier. No equals. Yep, they don&#8217;t have a return value. They are the entry point into your code, and any effects of running the code must be output to the console (e.g. with <strong>println</strong> or via a GUI) or written to the file system (or the internet somewhere). The outputs of such functions (ones which do not return a value) are called side-effects. You need them for the main methods. However, in many styles of programming, a great deal of work is done with side-effects. I&#8217;ve been trying to gently lead the readers of this tutorial to adopt a more functional approach that tries to avoid them. I&#8217;ve found it a more effective style myself in my own coding, so I&#8217;m hoping it will serve you all better to start from that point. (Note that Scala supports many styles of programming, which is nice because you have choice and can go with what you find most suitable.)</p>
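<p>Here is a minimal sketch of what that style can look like in a compiled program. The names <strong>Shout</strong>, <strong>shout</strong> and <strong>shoutAll</strong> are invented for this example: the functions that do the work return values, and the only side-effect is the single <strong>println</strong> in <strong>main</strong>.</p>
<p><pre class="brush: scala;">
// A sketch of keeping side-effects at the edge of a program.
object Shout {

  // These functions only compute and return values; they print nothing.
  def shout (word: String): String = word.toUpperCase + &quot;!&quot;

  def shoutAll (words: Array[String]): String = words.map(shout).mkString(&quot; &quot;)

  // The only side-effect (printing) happens here, in main.
  def main (args: Array[String]) {
    println(shoutAll(args))
  }

}
</pre></p>
<p>After compiling, <strong>scala Shout hello world</strong> should print <strong>HELLO! WORLD!</strong>. Because <strong>shout</strong> and <strong>shoutAll</strong> return their results, they can be reused and checked by other code without anything being printed.</p>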
<h2>Cleaning up</h2>
<p>You may have noticed that the directory you are working in becomes quite littered with class files as you run <strong>scalac</strong> on your Scala files. For example, here&#8217;s what the directory used in this tutorial looks like after compiling all of the files.</p>
<p><pre class="brush: bash;">
$ ls
Goodbye$.class                          PersonApp.scala
Goodbye.class                           Radiohead.scala
Greetings.scala                         RadioheadGreeting$$anonfun$main$1.class
Hello$.class                            RadioheadGreeting$.class
Hello.class                             RadioheadGreeting.class
Person.class                            SayIt$$anonfun$main$1.class
PersonApp$$anonfun$main$1.class         SayIt$.class
PersonApp$.class                        SayIt.class
PersonApp.class
</pre></p>
<p>A mess, right? Generally, one would rarely develop a Scala application by compiling it directly in this way. Instead a build system is used to manage the compilation process, organize the files, and allow one to easily access software libraries created by other developers. The next tutorial will cover this, using <a href="https://github.com/harrah/xsbt/wiki">SBT</a> (the Simple Build Tool).</p>
<p><span style="color:#888888;">Copyright 2011 Jason Baldridge</span></p>
<p><span style="color:#888888;">The text of this tutorial is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License. Attribution may be provided by linking to www.jasonbaldridge.com and to this original tutorial.</span></p>
<p><span style="color:#888888;">Suggestions, improvements, extensions and bug fixes welcome — please email Jason at jasonbaldridge@gmail.com or provide a comment to this post.</span></p>
]]></content:encoded>
			<wfw:commentRss>https://bcomposes.wordpress.com/2011/10/25/first-steps-in-scala-for-beginning-programmers-part-10/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="https://secure.gravatar.com/avatar/1a5262c6575daf92f62f5cc6d9f1f00f?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jasonbaldridge</media:title>
		</media:content>
	</item>
		<item>
		<title>First steps in Scala for beginning programmers, Part 9</title>
		<link>https://bcomposes.wordpress.com/2011/10/24/first-steps-in-scala-for-beginning-programmers-part-9/</link>
		<comments>https://bcomposes.wordpress.com/2011/10/24/first-steps-in-scala-for-beginning-programmers-part-9/#comments</comments>
		<pubDate>Tue, 25 Oct 2011 00:51:18 +0000</pubDate>
		<dc:creator>jasonbaldridge</dc:creator>
				<category><![CDATA[programming]]></category>
		<category><![CDATA[scala]]></category>
		<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://bcomposes.wordpress.com/?p=162</guid>
		<description><![CDATA[Topics: objects, classes, inheritance, traits, Lists with multiple related types, apply Preface This is part 9 of tutorials for first-time programmers getting into Scala. Other posts are on this blog, and you can get links to those and other resources on the links page of the Computational Linguistics course I’m creating these for. This tutorial &#8230;<p><a href="https://bcomposes.wordpress.com/2011/10/24/first-steps-in-scala-for-beginning-programmers-part-9/" class="more-link">Read More</a></p><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=bcomposes.wordpress.com&amp;blog=24939937&amp;post=162&amp;subd=bcomposes&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><strong>Topics</strong>: <em>objects, classes, inheritance, traits, Lists with multiple related types, apply</em></p>
<h2>Preface</h2>
<p>This is part 9 of tutorials for first-time programmers getting into Scala. Other posts are on this blog, and you can get links to those and other resources on <a href="http://icl-f11.utcompling.com/links">the links page of the Computational Linguistics course</a> I’m creating these for.</p>
<p>This tutorial is about object-oriented programming with Scala. Most of what we&#8217;ve seen so far has been programming with functions and using basic types, like Int, Double, and String, and with predefined types like List and Map. As it turns out, these are all classes, or types of Scala data structures that allow one to create objects, or instances of the type. This tutorial will not give a broad introduction to object-oriented programming, but it will give some practical examples of classes and objects and how to use them. I apologize in advance for some sloppiness in the presentation of object-oriented concepts; the intent is to get across the ideas for beginners mainly through intuitive examples without being mired in lots of technical details. See <a href="http://en.wikipedia.org/wiki/Object-oriented_programming">the Wikipedia page on object-oriented programming</a> for more detail.</p>
<p>Note that the definitions of objects and classes in this tutorial are most easily viewed as plain text, out of the REPL. So, I&#8217;ll put a piece of code into the text, and you should add it to your own REPL (by simply cutting and pasting) in order to be able to follow along.</p>
<h2>Objects</h2>
<p>At its core, an object can be thought of as a structure that encapsulates some data and functions. Let&#8217;s start with an example of an object representing a person and some of their possible attributes.</p>
<p><pre class="brush: scala;">
object JohnSmith {
  val firstName = &quot;John&quot;
  val lastName = &quot;Smith&quot;
  val age = 37
  val occupation = &quot;linguist&quot;

  def fullName: String = firstName + &quot; &quot; + lastName

  def greet (formal: Boolean): String = {
    if (formal)
      &quot;Hello, my name is &quot; + fullName + &quot;. I'm a &quot; + occupation + &quot;.&quot;
    else
      &quot;Hi, I'm &quot; + firstName + &quot;!&quot;
  }

}
</pre></p>
<p>If you put this into the Scala REPL, you&#8217;ll be able to access the fields (<strong>firstName</strong>, <strong>lastName</strong>, <strong>age</strong>, and <strong>occupation</strong>) and the functions (<strong>fullName</strong> and <strong>greet</strong>).</p>
<p><pre class="brush: scala;">
scala&gt; JohnSmith.firstName
res0: java.lang.String = John

scala&gt; JohnSmith.fullName
res1: String = John Smith

scala&gt; JohnSmith.greet(true)
res2: String = Hello, my name is John Smith. I'm a linguist.

scala&gt; JohnSmith.greet(false)
res3: String = Hi, I'm John!
</pre></p>
<p>So, at its most basic level, an object is just that: a collection of values and functions (also often called methods). You can access any of those values or functions by giving the name of the object followed by a period followed by the value or function you want to use. This can be useful for organizing such collections, but it also leads to many more possibilities, as we&#8217;ll see.</p>
<p>We might of course be interested in having the information about another person encapsulated in this way. We could do this by mimicking the definition for John Smith.</p>
<p><pre class="brush: scala;">
object JaneDoe {
  val firstName = &quot;Jane&quot;
  val lastName = &quot;Doe&quot;
  val age = 34
  val occupation = &quot;computer scientist&quot;

  def fullName: String = firstName + &quot; &quot; + lastName

  def greet (formal: Boolean): String = {
    if (formal)
      &quot;Hello, my name is &quot; + fullName + &quot;. I'm a &quot; + occupation + &quot;.&quot;
    else
      &quot;Hi, I'm &quot; + firstName + &quot;!&quot;
  }

}
</pre></p>
<p>After adding the above code to the REPL, now Jane Doe can greet us.</p>
<p><pre class="brush: scala;">
scala&gt; JaneDoe.greet(true)
res4: String = Hello, my name is Jane Doe. I'm a computer scientist.

scala&gt; JaneDoe.greet(false)
res5: String = Hi, I'm Jane!
</pre></p>
<p>Of course, I created the <strong>JaneDoe</strong> object by doing a copy-and-paste and then replacing the fields with Jane Doe&#8217;s information. This leads to a lot of wasted effort: the fields are the same, but the values are different, and the functions are completely identical. If you want to change something about the way greetings are made, you&#8217;d have to update it across all of the objects.</p>
<p>More importantly, these two objects are completely distinct from one another: one cannot put them in a list and map a function over that list. Consider the following failed attempt.</p>
<p><pre class="brush: scala;">
scala&gt; val people = List(JohnSmith, JaneDoe)
people: List[ScalaObject] = List(JohnSmith$@698fcb66, JaneDoe$@5f72cbae)

scala&gt; people.map(person =&gt; person.firstName)
&lt;console&gt;:11: error: value firstName is not a member of ScalaObject
people.map(person =&gt; person.firstName)
                                          ^
</pre></p>
<p>The only thing that Scala knows about <strong>JohnSmith</strong> and <strong>JaneDoe</strong> is that they are <strong>ScalaObjects</strong>. That means that a list of such objects can basically just contain them and allow you to move them around as a group. So, something more is needed to make these collections more useful and more general.</p>
<h2>Classes</h2>
<p>With the list above, what we&#8217;d like to have is a <strong>List[Person]</strong>, where <strong>Person</strong> is a type that has known fields and functions. We can accomplish this by defining a <strong>Person</strong> class and then defining John and Jane as members of that class. This also reduces the cut-and-paste duplication problem noted earlier. Here&#8217;s what it looks like.</p>
<p><pre class="brush: scala;">
class Person (
  val firstName: String,
  val lastName: String,
  val age: Int,
  val occupation: String
) {

  def fullName: String = firstName + &quot; &quot; + lastName

  def greet (formal: Boolean): String = {
    if (formal)
      &quot;Hello, my name is &quot; + fullName + &quot;. I'm a &quot; + occupation + &quot;.&quot;
    else
      &quot;Hi, I'm &quot; + firstName + &quot;!&quot;
  }

}
</pre></p>
<p>The <em>class</em> keyword indicates that this is a class definition and <strong>Person</strong> is the name of the class. The next part of the definition is a set of parameters to the class that allow us to construct objects that are instances of the class &#8212; in other words, they are placeholders that allow us to use the <strong>Person</strong> class as a factory for creating <strong>Person</strong> objects. We do this by using the <em>new</em> keyword, giving the name of the class and supplying the values for each of the parameters. For example, here&#8217;s how we can create John Smith now.</p>
<p><pre class="brush: scala;">
scala&gt; val johnSmith = new Person(&quot;John&quot;, &quot;Smith&quot;, 37, &quot;linguist&quot;)
johnSmith: Person = Person@1979d4fb
</pre></p>
<p>Just as we could with the one-off standalone <strong>JohnSmith</strong> object previously, we can now access the fields and functions.</p>
<p><pre class="brush: scala;">
scala&gt; johnSmith.age
res8: Int = 37

scala&gt; johnSmith.greet(true)
res9: String = Hello, my name is John Smith. I'm a linguist.
</pre></p>
<p>Defining other people is now easy, and doesn&#8217;t require any cutting-and-pasting.</p>
<p><pre class="brush: scala;">
scala&gt; val janeDoe = new Person(&quot;Jane&quot;, &quot;Doe&quot;, 34, &quot;computer scientist&quot;)
janeDoe: Person = Person@7ff5376c

scala&gt; val johnDoe = new Person(&quot;John&quot;, &quot;Doe&quot;, 43, &quot;philosopher&quot;)
johnDoe: Person = Person@6544c984

scala&gt; val johnBrown = new Person(&quot;John&quot;, &quot;Brown&quot;, 28, &quot;mathematician&quot;)
johnBrown: Person = Person@4076a247
</pre></p>
<p>These <strong>Person</strong> objects can now be put into a list together, giving us a <strong>List[Person]</strong> that allows mapping to retrieve specific values, like first names and ages, and performing computations like calculating the average age of the individuals in the list.</p>
<p><pre class="brush: scala;">
scala&gt; val people = List(johnSmith, janeDoe, johnDoe, johnBrown)
people: List[Person] = List(Person@1979d4fb, Person@7ff5376c, Person@6544c984, Person@4076a247)

scala&gt; people.map(person =&gt; person.firstName)
res10: List[String] = List(John, Jane, John, John)

scala&gt; people.map(person =&gt; person.age)
res11: List[Int] = List(37, 34, 43, 28)

scala&gt; people.map(person =&gt; person.age).sum/people.length.toDouble
res12: Double = 35.5
</pre></p>
<p>We can sort them according to age.</p>
<p><pre class="brush: scala;">
scala&gt; val ageSortedPeople = people.sortBy(_.age)
ageSortedPeople: List[Person] = List(Person@4076a247, Person@7ff5376c, Person@1979d4fb, Person@6544c984)

scala&gt; ageSortedPeople.map(person =&gt; person.fullName + &quot;:&quot; + person.age)
res13: List[java.lang.String] = List(John Brown:28, Jane Doe:34, John Smith:37, John Doe:43)
</pre></p>
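<p>The key given to <strong>sortBy</strong> does not have to be a single field. As a rough sketch (again using a stand-in <strong>Person</strong> class rather than the full one from above), negating a number reverses the order, and a tuple sorts by its first element and breaks ties with its second.</p>
<p><pre class="brush: scala;">
// Stand-in for the Person class defined earlier.
class Person(val firstName: String, val lastName: String, val age: Int, val occupation: String) {
  def fullName = firstName + &quot; &quot; + lastName
}

val people = List(
  new Person(&quot;John&quot;, &quot;Smith&quot;, 37, &quot;linguist&quot;),
  new Person(&quot;Jane&quot;, &quot;Doe&quot;, 34, &quot;computer scientist&quot;),
  new Person(&quot;John&quot;, &quot;Doe&quot;, 43, &quot;philosopher&quot;),
  new Person(&quot;John&quot;, &quot;Brown&quot;, 28, &quot;mathematician&quot;)
)

// Oldest first: sort on the negated age.
val oldestFirst = people.sortBy(person =&gt; -person.age).map(person =&gt; person.fullName)
// List(John Doe, John Smith, Jane Doe, John Brown)

// Alphabetical by last name, then by first name.
val alphabetical = people.sortBy(person =&gt; (person.lastName, person.firstName)).map(person =&gt; person.fullName)
// List(John Brown, Jane Doe, John Doe, John Smith)
</pre></p>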
<p>We can also group people by first name, last name, etc.</p>
<p><pre class="brush: scala;">
scala&gt; people.groupBy(person =&gt; person.firstName)
res14: scala.collection.immutable.Map[String,List[Person]] = Map(Jane -&gt; List(Person@7ff5376c), John -&gt; List(Person@1979d4fb, Person@6544c984, Person@4076a247))

scala&gt; people.groupBy(person =&gt; person.lastName)
res15: scala.collection.immutable.Map[String,List[Person]] = Map(Brown -&gt; List(Person@4076a247), Smith -&gt; List(Person@1979d4fb), Doe -&gt; List(Person@7ff5376c, Person@6544c984))
</pre></p>
<p>With this, we can have all the Johns greet us.</p>
<p><pre class="brush: scala;">
scala&gt; people.groupBy(person =&gt; person.firstName)(&quot;John&quot;).foreach(john =&gt; println(john.greet(true)))
Hello, my name is John Smith. I'm a linguist.
Hello, my name is John Doe. I'm a philosopher.
Hello, my name is John Brown. I'm a mathematician.
</pre></p>
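<p>The result of <strong>groupBy</strong> is an ordinary <strong>Map</strong>, so it can be transformed and queried like any other. The sketch below (with a stand-in <strong>Person</strong> class) counts how many people share each first name, and shows a safer lookup for a name that might not be present: asking the map directly for a missing key throws a <strong>NoSuchElementException</strong>, whereas <strong>getOrElse</strong> lets us supply a default.</p>
<p><pre class="brush: scala;">
// Stand-in for the Person class defined earlier (fields only).
class Person(val firstName: String, val lastName: String, val age: Int, val occupation: String)

val people = List(
  new Person(&quot;John&quot;, &quot;Smith&quot;, 37, &quot;linguist&quot;),
  new Person(&quot;Jane&quot;, &quot;Doe&quot;, 34, &quot;computer scientist&quot;),
  new Person(&quot;John&quot;, &quot;Doe&quot;, 43, &quot;philosopher&quot;),
  new Person(&quot;John&quot;, &quot;Brown&quot;, 28, &quot;mathematician&quot;)
)

val byFirstName = people.groupBy(person =&gt; person.firstName)

// Turn each group into its size.
val countsByFirstName = byFirstName.map { case (name, group) =&gt; (name, group.length) }
// Map(John -&gt; 3, Jane -&gt; 1)

// No one is named Mary, so fall back to an empty list instead of failing.
val marys = byFirstName.getOrElse(&quot;Mary&quot;, Nil)
</pre></p>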
<h2>Standalone objects</h2>
<p>Above, we saw how to create instances of the <strong>Person</strong> class by using the <em>new</em> keyword and assigning the resulting object to a variable. We can come back full circle to the first <strong>JohnSmith</strong> object we created, which was a standalone <strong>ScalaObject</strong>. We can instead create such a standalone object by <em>extending</em> the <strong>Person</strong> class.</p>
<p><pre class="brush: scala;">
scala&gt; object ThomYorke extends Person(&quot;Thom&quot;, &quot;Yorke&quot;, 43, &quot;musician&quot;)
defined module ThomYorke

scala&gt; ThomYorke.greet(true)
res25: String = Hello, my name is Thom Yorke. I'm a musician.
</pre></p>
<p>By extending the Person class to create the object, we are saying that the object is a kind of <strong>Person</strong> &#8212; see more on inheritance below. So, <strong>ThomYorke</strong> is a <strong>Person</strong> object, like the others we created, but it is for a different use case that we&#8217;ll see more of in the next tutorial. For now, I&#8217;ll summarize, very roughly, by saying that the <strong>ThomYorke</strong> object can be made more accessible by other code that might be using my code, while the <strong>johnSmith</strong> and <strong>janeDoe</strong> objects are going to be more locally contained.</p>
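<p>To make the &#8220;kind of <strong>Person</strong>&#8221; point concrete, here is a small sketch in which the standalone object is used anywhere a <strong>Person</strong> is expected. The <strong>introduce</strong> function is a hypothetical helper, and the <strong>Person</strong> class is a cut-down stand-in for the one defined earlier.</p>
<p><pre class="brush: scala;">
// Stand-in for the Person class defined earlier.
class Person(val firstName: String, val lastName: String, val age: Int, val occupation: String) {
  def fullName = firstName + &quot; &quot; + lastName
}

object ThomYorke extends Person(&quot;Thom&quot;, &quot;Yorke&quot;, 43, &quot;musician&quot;)
val johnSmith = new Person(&quot;John&quot;, &quot;Smith&quot;, 37, &quot;linguist&quot;)

// Hypothetical helper that accepts any Person.
def introduce(person: Person): String = person.fullName + &quot; (&quot; + person.occupation + &quot;)&quot;

val intro = introduce(ThomYorke)   // Thom Yorke (musician)

// The standalone object and an ordinary instance can share a list.
val mixed: List[Person] = List(ThomYorke, johnSmith)
val names = mixed.map(person =&gt; person.fullName)   // List(Thom Yorke, John Smith)
</pre></p>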
<h2>Inheritance</h2>
<p>The standalone objects lead us naturally to the idea of inheritance. In many domains, there are natural hierarchies of types, such that properties of a super type are inherited by its subtypes (e.g. fish have gills and swim, so salmon have gills and swim). For example, we could have a <strong>Linguist</strong> type that is a kind of <strong>Person</strong>, a <strong>ComputerScientist</strong> type that is a kind of <strong>Person</strong>, and so on. To model this, we create one class that extends another and possibly provides some additional parameters, such as the following definition of a <strong>Linguist</strong> sub-type of <strong>Person</strong>.</p>
<p><pre class="brush: scala;">
class Linguist (
  firstName: String,
  lastName: String,
  age: Int,
  val speciality: String,
  val favoriteLanguage: String
) extends Person(firstName, lastName, age, &quot;linguist&quot;) {

  def workGreeting =
    &quot;As a &quot; + occupation + &quot;, I am a &quot; + speciality + &quot; who likes to study the language &quot; + favoriteLanguage + &quot;.&quot;

}
</pre></p>
<p>The <strong>Linguist</strong> class has its own parameter list: some of these, like <strong>firstName</strong>, <strong>lastName</strong>, and <strong>age</strong>, are passed on to <strong>Person</strong>, and there are new parameter fields <strong>speciality</strong> and <strong>favoriteLanguage</strong>. The <em>extends</em> portion of the definition passes on the relevant parameters needed to construct all the information to make a <strong>Person</strong>, and for a <strong>Linguist</strong>, it directly sets the occupation parameter to be &#8220;linguist&#8221; &#8212; thus, we don&#8217;t need to provide that when we construct a <strong>Linguist</strong>, such as Noam Chomsky.</p>
<p><pre class="brush: scala;">
scala&gt; val noamChomsky = new Linguist(&quot;Noam&quot;, &quot;Chomsky&quot;, 83, &quot;syntactician&quot;, &quot;English&quot;)
noamChomsky: Linguist = Linguist@54c0627f
</pre></p>
<p>Having defined a <strong>Linguist</strong> object in this way, we can ask it to give its work greeting.</p>
<p><pre class="brush: scala;">
scala&gt; noamChomsky.workGreeting
res26: java.lang.String = As a linguist, I am a syntactician who likes to study the language English.
</pre></p>
<p>We can also access fields and functions of <strong>Person</strong> objects, like <strong>age</strong> and <strong>greet</strong>.</p>
<p><pre class="brush: scala;">
scala&gt; noamChomsky.age
res27: Int = 83

scala&gt; noamChomsky.greet(true)
res28: String = Hello, my name is Noam Chomsky. I'm a linguist.
</pre></p>
<p>Of course, the Linguist-specific fields like <strong>favoriteLanguage</strong> are accessible too.</p>
<p><pre class="brush: scala;">
scala&gt; noamChomsky.favoriteLanguage
res29: String = English
</pre></p>
<p>The observant reader will have noticed that some of the parameters are prefaced with <em>val</em> and others are not. We&#8217;ll get back to that distinction a bit later.</p>
<h2>Traits</h2>
<p>We could of course now go on to define a <strong>ComputerScientist</strong> class that would also have a <strong>workGreeting</strong> function, but <strong>Linguist.workGreeting</strong> and <strong>ComputerScientist.workGreeting</strong> would be entirely separate, with nothing telling Scala that the two classes share this capability. To express that connection, we can use traits, which are like classes, but which define an interface of functions and fields that classes can supply concrete values and implementations for. (Note: traits can also define concrete fields and functions, so they aren&#8217;t limited to placeholder functions as we show below.)</p>
<p>As an example, here&#8217;s a <strong>Worker</strong> trait, which simply defines a function <strong>workGreeting</strong> and declares that it must return a <strong>String</strong>.</p>
<p><pre class="brush: scala;">
trait Worker {
  def workGreeting: String
}
</pre></p>
<p>The <strong>Linguist</strong> class defined earlier already provides an implementation of that function. To allow a <strong>Linguist</strong> to be considered as a type of <strong>Worker</strong>, we add <em>with Worker</em> after extending <strong>Person</strong>.</p>
<p><pre class="brush: scala;">
class Linguist (
  firstName: String,
  lastName: String,
  age: Int,
  val speciality: String,
  val favoriteLanguage: String
) extends Person(firstName, lastName, age, &quot;linguist&quot;) with Worker {

  def workGreeting =
    &quot;As a &quot; + occupation + &quot;, I am a &quot; + speciality + &quot; who likes to study the language &quot; + favoriteLanguage + &quot;.&quot;

}
</pre></p>
<p>This is called &#8220;mixing in&#8221; the trait <strong>Worker</strong>, because the <strong>Linguist</strong> class mixes in the fields and functions of <strong>Worker</strong> with those of <strong>Person</strong>.</p>
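<p>Mixing in becomes more tangible when the trait carries a concrete function of its own. The following is only a sketch: <strong>loudWorkGreeting</strong> is a hypothetical addition to <strong>Worker</strong> that is not part of the trait as defined in this tutorial, and <strong>Person</strong> is a cut-down stand-in. The point is that <strong>Linguist</strong> supplies <strong>workGreeting</strong> and receives <strong>loudWorkGreeting</strong> without writing any code for it.</p>
<p><pre class="brush: scala;">
// Stand-in for the Person class defined earlier (fields only).
class Person(val firstName: String, val lastName: String, val age: Int, val occupation: String)

trait Worker {
  def workGreeting: String                                  // abstract: each class supplies it
  def loudWorkGreeting: String = workGreeting.toUpperCase   // concrete (hypothetical): mixed in as-is
}

class Linguist (
  firstName: String,
  lastName: String,
  age: Int,
  val speciality: String,
  val favoriteLanguage: String
) extends Person(firstName, lastName, age, &quot;linguist&quot;) with Worker {

  def workGreeting =
    &quot;As a &quot; + occupation + &quot;, I am a &quot; + speciality + &quot; who likes to study the language &quot; + favoriteLanguage + &quot;.&quot;

}

val noamChomsky = new Linguist(&quot;Noam&quot;, &quot;Chomsky&quot;, 83, &quot;syntactician&quot;, &quot;English&quot;)
val loud = noamChomsky.loudWorkGreeting
// AS A LINGUIST, I AM A SYNTACTICIAN WHO LIKES TO STUDY THE LANGUAGE ENGLISH.
</pre></p>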
<p>Note that we can also create classes that simply extend a trait like <strong>Worker</strong>.</p>
<p><pre class="brush: scala;">
class Student (school: String, subject: String) extends Worker {
  def workGreeting = &quot;I'm studying &quot; + subject + &quot; at &quot; + school + &quot;!&quot;
}
</pre></p>
<p>We can now create a <strong>Student</strong> object and request their greeting.</p>
<p><pre class="brush: scala;">
scala&gt; val anonymousStudent = new Student(&quot;The University of Texas at Austin&quot;, &quot;history&quot;)
anonymousStudent: Student = Student@734445b5

scala&gt; anonymousStudent.workGreeting
res32: java.lang.String = I'm studying history at The University of Texas at Austin!
</pre></p>
<p>Notice that the parameters school and subject were not preceded by <em>val</em> in the definition of <strong>Student</strong>. That means that they are not member fields of the <strong>Student</strong> class, which means that they cannot be accessed externally. For example, attempting to access the value provided for <strong>school</strong> for <strong>anonymousStudent</strong> fails.</p>
<p><pre class="brush: scala;">
scala&gt; anonymousStudent.school
&lt;console&gt;:11: error: value school is not a member of Student
anonymousStudent.school
</pre></p>
<p>Of course, internally, <strong>Student</strong> can use the values provided to such parameters, for example in defining the result of <strong>workGreeting</strong>. This sort of encapsulation hides properties of the objects of a class from code that is outside the class; this strategy can help reduce the degrees of freedom available to users of your code so that they only use what you want them to. In general, if others don&#8217;t need to use it, you shouldn&#8217;t make it available to them.</p>
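<p>To see the two kinds of parameters side by side, here is a sketch of a hypothetical variant of <strong>Student</strong>, called <strong>EnrolledStudent</strong>, in which <strong>school</strong> is marked with <em>val</em> and <strong>subject</strong> is not. Both can be used inside the class, but only <strong>school</strong> is visible from outside.</p>
<p><pre class="brush: scala;">
trait Worker {
  def workGreeting: String
}

// Hypothetical variant of Student: school is a member field, subject is a plain parameter.
class EnrolledStudent (val school: String, subject: String) extends Worker {
  def workGreeting = &quot;I'm studying &quot; + subject + &quot; at &quot; + school + &quot;!&quot;
}

val student = new EnrolledStudent(&quot;The University of Texas at Austin&quot;, &quot;history&quot;)

val school = student.school          // allowed: school was declared with val
val greeting = student.workGreeting  // subject is still usable inside the class
// student.subject                   // does not compile: subject is hidden from outside code
</pre></p>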
<p>Returning to classes that are both <strong>Persons</strong> and <strong>Workers</strong>, when we define a <strong>ComputerScientist</strong>, we do a similar <em>extends &#8230; with</em> declaration as we did for <strong>Linguist</strong>.</p>
<p><pre class="brush: scala;">
class ComputerScientist (
  firstName: String,
  lastName: String,
  age: Int,
  val speciality: String,
  favoriteProgrammingLanguage: String
) extends Person(firstName, lastName, age, &quot;computer scientist&quot;) with Worker {

  def workGreeting =
    &quot;As a &quot; + occupation + &quot;, I work on &quot; + speciality + &quot;. Much of my code is written in &quot; + favoriteProgrammingLanguage + &quot;.&quot;

}
</pre></p>
<p>Let&#8217;s create <a href="http://www.cs.umass.edu/~mccallum/">Andrew McCallum</a> as a <strong>ComputerScientist</strong> object.</p>
<p><pre class="brush: scala;">
scala&gt; val andrewMcCallum = new ComputerScientist(&quot;Andrew&quot;, &quot;McCallum&quot;, 44, &quot;machine learning&quot;, &quot;Scala&quot;)
andrewMcCallum: ComputerScientist = ComputerScientist@493cd5ba

scala&gt; andrewMcCallum.workGreeting
res31: java.lang.String = As a computer scientist, I work on machine learning. Much of my code is written in Scala.
</pre></p>
<p>Because we redefined <strong>Linguist</strong> to be a <strong>Worker</strong>, we need to recreate Noam Chomsky using the new definition. (The creation looks the same as before, but it uses the new class definition that has been updated in the REPL.)</p>
<p><pre class="brush: scala;">
scala&gt; val noamChomsky = new Linguist(&quot;Noam&quot;, &quot;Chomsky&quot;, 83, &quot;syntactician&quot;, &quot;English&quot;)
noamChomsky: Linguist = Linguist@6fccaf14
</pre></p>
<p>A minor thing to note: the <strong>speciality</strong> field of <strong>ComputerScientist</strong> is disconnected from that of <strong>Linguist</strong>, so there is no particular expectation of consistency of use across the two: for <strong>Linguist</strong> it is a description of a person working in a sub-area, but for <strong>ComputerScientist</strong> it is a description of the sub-area itself.</p>
<p>So, what happens if we put noamChomsky and andrewMcCallum in a List together?</p>
<p><pre class="brush: scala;">
scala&gt; val professors = List(noamChomsky, andrewMcCallum)
professors: List[Person with Worker] = List(Linguist@6fccaf14, ComputerScientist@493cd5ba)
</pre></p>
<p>Scala has created a list with type <strong>List[Person with Worker]</strong>; this is the most specific type that is valid for all elements of the list. It means we can treat all of the elements as <strong>Persons</strong>, e.g. accessing their <strong>occupation</strong> (which is a member field of <strong>Person</strong>).</p>
<p><pre class="brush: scala;">
scala&gt; professors.map(prof =&gt; prof.occupation)
res34: List[String] = List(linguist, computer scientist)
</pre></p>
<p>And we can treat each element of the list as a <strong>Person</strong> and a <strong>Worker</strong>, e.g. printing out their <strong>fullName</strong> (from <strong>Person</strong>) and their <strong>workGreeting</strong> (from <strong>Worker</strong>).</p>
<p><pre class="brush: scala;">
scala&gt; professors.foreach(prof =&gt; println(prof.fullName + &quot;: &quot; + prof.workGreeting))
Noam Chomsky: As a linguist, I am a syntactician who likes to study the language English.
Andrew McCallum: As a computer scientist, I work on machine learning. Much of my code is written in Scala.
</pre></p>
<p>We cannot, however, access fields and functions that are specific to <strong>Linguists</strong> or <strong>ComputerScientists</strong>, such as <strong>favoriteLanguage</strong> from <strong>Linguist</strong>.</p>
<p><pre class="brush: scala;">
scala&gt; professors.map(prof =&gt; prof.favoriteLanguage)
&lt;console&gt;:15: error: value favoriteLanguage is not a member of Person with Worker
professors.map(prof =&gt; prof.favoriteLanguage)
</pre></p>
<p>It is easy to see why Scala has this behavior: even though that would have been valid for <strong>noamChomsky</strong>, it would not be for <strong>andrewMcCallum</strong> (according to the way we defined <strong>Linguist</strong> and <strong>ComputerScientist</strong>).</p>
<h2>Matching on types in polymorphic Lists</h2>
<p>Consider what happens when the <strong>anonymousStudent</strong> is in a list with the professors.</p>
<p><pre class="brush: scala;">
scala&gt; val workers = List(noamChomsky, andrewMcCallum, anonymousStudent)
workers: List[ScalaObject with Worker] = List(Linguist@6fccaf14, ComputerScientist@493cd5ba, Student@734445b5)
</pre></p>
<p>The <strong>Person</strong> type is gone, and we now have a list of a more general type <strong>ScalaObject with Worker</strong>. Now we can only use the <strong>workGreeting</strong> method from <strong>Worker</strong>.</p>
<p>However, it is worth pointing out that <em>match</em> statements come in handy when you have collections of heterogeneous objects. For example, put the following code into the REPL.</p>
<p><pre class="brush: scala;">
val people = List(johnSmith, noamChomsky, andrewMcCallum, anonymousStudent)

people.foreach { person =&gt;
  person match {
    case x: Person with Worker =&gt; println(x.fullName + &quot;: &quot; + x.workGreeting)
    case x: Person =&gt; println(x.fullName + &quot;: &quot; + x.greet(true))
    case x: Worker =&gt; println(&quot;Anonymous:&quot; + x.workGreeting)
  }
}
</pre></p>
<p>The result is the following (remember that <strong>johnSmith</strong> was never defined as a <strong>Linguist</strong> &#8212; he was defined as a <strong>Person</strong> whose occupation is &#8220;linguist&#8221;).</p>
<p><pre class="brush: scala;">
John Smith: Hello, my name is John Smith. I'm a linguist.
Noam Chomsky: As a linguist, I am a syntactician who likes to study the language English.
Andrew McCallum: As a computer scientist, I work on machine learning. Much of my code is written in Scala.
Anonymous:I'm studying history at The University of Texas at Austin!
</pre></p>
<p>So, we can switch our behavior by matching to more specific types using Scala&#8217;s pattern matching.</p>
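<p>Two details of type matching are worth a small sketch. First, cases are tried from top to bottom and the first one that fits wins, so the most specific type should come first. Second, the <strong>collect</strong> function on lists keeps only the elements that match a case, which gives a way to reach subtype-specific fields such as <strong>favoriteLanguage</strong> from a mixed list. The classes below are simplified stand-ins for the ones defined above (this <strong>Linguist</strong> drops <strong>speciality</strong>, for instance), and <strong>label</strong> is a hypothetical helper.</p>
<p><pre class="brush: scala;">
// Simplified stand-ins for the classes defined above.
class Person(val firstName: String, val lastName: String, val age: Int, val occupation: String) {
  def fullName = firstName + &quot; &quot; + lastName
}

trait Worker {
  def workGreeting: String
}

class Linguist(firstName: String, lastName: String, age: Int, val favoriteLanguage: String)
  extends Person(firstName, lastName, age, &quot;linguist&quot;) with Worker {
  def workGreeting = &quot;I like to study &quot; + favoriteLanguage + &quot;.&quot;
}

class Student(school: String, subject: String) extends Worker {
  def workGreeting = &quot;I'm studying &quot; + subject + &quot; at &quot; + school + &quot;!&quot;
}

val johnSmith = new Person(&quot;John&quot;, &quot;Smith&quot;, 37, &quot;linguist&quot;)
val noamChomsky = new Linguist(&quot;Noam&quot;, &quot;Chomsky&quot;, 83, &quot;English&quot;)
val anonymousStudent = new Student(&quot;The University of Texas at Austin&quot;, &quot;history&quot;)
val everyone = List(johnSmith, noamChomsky, anonymousStudent)

// Hypothetical helper: the most specific type is listed first.
def label(x: Any): String = x match {
  case l: Linguist =&gt; &quot;Linguist who studies &quot; + l.favoriteLanguage
  case p: Person   =&gt; &quot;Person named &quot; + p.fullName
  case _: Worker   =&gt; &quot;Worker with no name&quot;
  case _           =&gt; &quot;something else&quot;
}

val labels = everyone.map(x =&gt; label(x))
// List(Person named John Smith, Linguist who studies English, Worker with no name)

// Only the Linguists survive, so their specific field is available.
val favoriteLanguages = everyone.collect { case l: Linguist =&gt; l.favoriteLanguage }
// List(English)
</pre></p>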
<h2>The apply function</h2>
<p>Scala provides a simple but incredibly nice feature: if you define an <strong>apply</strong> function in a class or object, you don&#8217;t actually need to write &#8220;apply&#8221; in order to use it. As an example, the following object adds one to an argument supplied to its <strong>apply</strong> method.</p>
<p><pre class="brush: scala;">
object AddOne {
  def apply (x: Int): Int = x+1
}
</pre></p>
<p>So, we can use it just like you&#8217;d normally expect.</p>
<p><pre class="brush: scala;">
scala&gt; AddOne.apply(3)
res41: Int = 4
</pre></p>
<p>But, we can also do without the &#8220;.apply&#8221; portion and get the same result.</p>
<p><pre class="brush: scala;">
scala&gt; AddOne(3)
res42: Int = 4
</pre></p>
<p>If a class has an <strong>apply</strong> method, then we can do the same trick with any object of that class.</p>
<p><pre class="brush: scala;">
class AddN (amountToAdd: Int) {
  def apply (x: Int): Int = x + amountToAdd
}

scala&gt; val add2 = new AddN(2)
add2: AddN = AddN@43ca04a1

scala&gt; add2(5)
res43: Int = 7

scala&gt; val add42 = new AddN(42)
add42: AddN = AddN@83e591f

scala&gt; add42(8)
res44: Int = 50
</pre></p>
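<p>This also sheds some light on the anonymous functions used throughout these tutorials: a function value is itself an object with an <strong>apply</strong> method. The sketch below calls a function value both ways, and uses an <strong>AddN</strong> object inside <strong>map</strong>. The name <strong>addOne</strong> is just an illustrative choice.</p>
<p><pre class="brush: scala;">
class AddN (amountToAdd: Int) {
  def apply (x: Int): Int = x + amountToAdd
}

val add2 = new AddN(2)

// An AddN object can do the work inside a map.
val shifted = List(1, 2, 3).map(x =&gt; add2(x))   // List(3, 4, 5)

// A function value has an apply method too.
val addOne = (x: Int) =&gt; x + 1
val viaSugar = addOne(3)         // 4
val viaApply = addOne.apply(3)   // 4
</pre></p>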
<p>As it turns out, you&#8217;ve been using <strong>apply</strong> methods quite often, without knowing it! When you have a <strong>List</strong> and you access an element by index, you&#8217;ve used the <strong>apply</strong> method of the <strong>List</strong> class.</p>
<p><pre class="brush: scala;">
scala&gt; val numbers = (10 to 20).toList
numbers: List[Int] = List(10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20)

scala&gt; numbers(3)
res46: Int = 13

scala&gt; numbers.apply(3)
res47: Int = 13
</pre></p>
<p>The same goes for accessing values using keys in a <strong>Map</strong>, and for many of the other classes you&#8217;ve been using in Scala so far.</p>
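<p>As a quick sketch of the <strong>Map</strong> case, using a small made-up map of ages: looking up a key with parentheses is a call to <strong>apply</strong>. Even building a list with <strong>List(...)</strong> is a call to the <strong>apply</strong> method of the <strong>List</strong> object.</p>
<p><pre class="brush: scala;">
val ages = Map(&quot;John Smith&quot; -&gt; 37, &quot;Jane Doe&quot; -&gt; 34)

val viaSugar = ages(&quot;Jane Doe&quot;)         // 34
val viaApply = ages.apply(&quot;Jane Doe&quot;)   // 34

// List(10, 11, 12) is shorthand for List.apply(10, 11, 12).
val built = List.apply(10, 11, 12)
</pre></p>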
<h2>Wrap-up</h2>
<p>This tutorial has covered the basics of object-oriented programming in Scala. Hopefully, it is enough to give a decent sense of what objects and classes are and how you can do things with them. There is much, much more to be learned about them, but this should be sufficient to get you started so that further study can be done meaningfully. It is important to understand these concepts since Scala is object-oriented from the ground up. In fact, in many of the previous tutorials, I&#8217;ve at times jumped through some extra hoops to try to describe what is going on without having to talk about object-orientation. But now you can see things like Int, Double, List, Map, and so on for what they are: classes that contain particular fields and functions that you can use to get things done. You can now start coding your own classes to enable your own custom behaviors in your applications.</p>
<p><span style="color:#888888;">Copyright 2011 Jason Baldridge</span></p>
<p><span style="color:#888888;">The text of this tutorial is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License. Attribution may be provided by linking to www.jasonbaldridge.com and to this original tutorial.</span></p>
<p><span style="color:#888888;">Suggestions, improvements, extensions and bug fixes welcome — please email Jason at jasonbaldridge@gmail.com or provide a comment to this post.</span></p>
]]></content:encoded>
			<wfw:commentRss>https://bcomposes.wordpress.com/2011/10/24/first-steps-in-scala-for-beginning-programmers-part-9/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="https://secure.gravatar.com/avatar/1a5262c6575daf92f62f5cc6d9f1f00f?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jasonbaldridge</media:title>
		</media:content>
	</item>
		<item>
		<title>Belle Scarlett Baldridge</title>
		<link>https://bcomposes.wordpress.com/2011/10/05/belle-scarlett-baldridge/</link>
		<comments>https://bcomposes.wordpress.com/2011/10/05/belle-scarlett-baldridge/#comments</comments>
		<pubDate>Wed, 05 Oct 2011 05:42:29 +0000</pubDate>
		<dc:creator>jasonbaldridge</dc:creator>
				<category><![CDATA[life]]></category>

		<guid isPermaLink="false">http://bcomposes.wordpress.com/?p=154</guid>
		<description><![CDATA[In loving memory of Belle Scarlett Baldridge September 29, 2011 I buried my baby daughter Belle today. It wasn’t supposed to be this way. Babies just aren’t supposed to die. We are fortunate to live in a time of favorable survival rates for babies and their mothers. We enjoy high degrees of order and predictability &#8230;<p><a href="https://bcomposes.wordpress.com/2011/10/05/belle-scarlett-baldridge/" class="more-link">Read More</a></p><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=bcomposes.wordpress.com&amp;blog=24939937&amp;post=154&amp;subd=bcomposes&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p style="text-align:center;"><strong>In loving memory of Belle Scarlett Baldridge</strong><br />
<em>September 29, 2011</em></p>
<p style="text-align:left;">I buried my baby daughter Belle today. It wasn’t supposed to be this way. Babies just aren’t supposed to die. We are fortunate to live in a time of favorable survival rates for babies and their mothers. We enjoy high degrees of order and predictability in our day-to-day lives (here in the USA, at least), and it is easy to forget that one still has innocence to lose. This has been the saddest, hardest week of my life. I had always heard that a parent should never have to bury their own child. I didn’t doubt it, but now I know it, fully. This morning, I gazed down at a gaping hole, my little girl’s grave, while I held her casket in my arms. It mirrored the hole already in my heart. It disarmed and terrified me, but also showed me that both were there to receive Belle and preserve her memory.</p>
<p>With this post, I seek to honor and remember Belle, to thank those who have supported us this week, to help myself grieve, and hopefully, to help&#8212;perhaps a little&#8212;others in the future who must unfortunately deal with the death of their child. My apologies if the post is on the (melo)dramatic side. It’s how I feel, and it seems to be part of my healing process, so please bear with me.</p>
<p>My wife Cheryl and I had long been anticipating Belle’s arrival, with a due date of today &#8211; October 4, 2011. Like most expecting parents, we had considered many of the possible outcomes of the pregnancy, including even the possibility of complications that would involve our baby and/or Cheryl needing hospitalization &#8212; but never the possibility that our baby Belle wouldn’t make it into this world, never the possibility of a stillbirth. The unyielding march of life and death has left us suddenly and unexpectedly bereft of a person we loved, cared for and were ready to teach and eventually send forth into the world.</p>
<p>We knew Belle from her kicks, and her responses to our voices, songs, and laughter. It’s an imperfect medium of communication, but it suffices to start the relationship that one builds with one’s child &#8212; they simply aren’t strangers when you see them for the first time. This is something that can perhaps be hard to understand for those who have not yet had children, and it is a common source of pain for parents of stillborn children: it is somehow perceived by many to not be as great a loss as for those whose children died after their birth date. A great line I read in one of the many materials I’ve been given about such loss is that on a scale of one to ten, the pain of losing a child is always a ten, no matter the age or circumstances. It’s true. I would submit that there is a further dynamic element for parents of a stillborn child: you have gone from a state of accelerating excitement and anticipation, to a huge resounding thud of shock and disbelief. The “what if’s” have in very short order become “never be’s.” This sudden reversal kicks in the first moment you are told that your baby’s heartbeat has stopped and then reverberates as you reel from the pain and try to regroup.</p>
<p>Little Belle is true to her name: she is beautiful, even in death. I can now only imagine what she would have looked like as she grew up, but thankfully I can do at least that. And, I can do that from a starting point of having been able to spend time with her on the day she was born, September 29, 2011. We had a wonderful team with us at Belle’s birth&#8212;including doctors, midwives, nurses, and doula&#8212;and they helped us through the intensely emotional and difficult process of bringing Belle into the world and, perhaps more importantly, to help us spend meaningful time with her before saying goodbye. They encouraged us to be with Belle, to hold her and take pictures, and not rush things. We now have at least those memories&#8212;even so bittersweet&#8212;to keep with us, something which many parents of stillborn babies are never given because no one tells them they can and should. This is a really important aspect of Belle’s birth that I hope to get across: you are hurting and spinning from the shock and pain, yet there are important decisions to be made from the very start; while you may have been provided with comprehensive and well-written literature on how to approach the situation, you have little emotional space for it and there is too much of it to possibly work through before you must make decisions.  If you or someone you care for finds themselves unfortunately in this situation, try to get across this message: take time with the baby and take pictures. You won’t get more chances later, and you’ll almost surely regret it if you don’t.</p>
<p>Another important thing for us was to have a small memorial service for Belle, and also a burial. As an agnostic without any religious affiliation, I had no default expectation for what to do. Cheryl and I had years ago decided that cremation would be the thing for us eventually. However, with Belle, Cheryl quickly realized that she wanted a place to visit her, so we went with a burial. I did not feel strongly about it, but it felt right to me when we did it today, so I’ll probably be glad for that choice in the long run. It was very hard to pick out her plot at the cemetery on Friday&#8212;it’s an area reserved for infants, a grid of small plots that serves as a concrete reminder of the fragility of the early days of life. Looking at the empty spot where Belle would be buried made it all seem more real, more this-is-really-happening, in the mix of surreal feelings of that day and the previous day. Of course, handing over a credit card to pay for the services and the plot then felt bizarre, an odd juxtaposition of a completely mundane action with the profound grief I was keeping in check. Regardless of that strangeness, it is one of those things which just must be done. Belle is now there, and it is a peaceful place, with trees and birds singing in them.</p>
<p>It turns out that stillbirths are more common than I would have ever thought. I had only directly known of one before Belle, and had assumed it must have been a case of extreme misfortune. Actually, in the USA, the average rate of stillbirths is roughly 1 in 150 births, about 26,000 babies every year. The rate is much higher in developing countries. Despite this prevalence, there apparently is not a great deal of research into it (and it seems to be an inherently difficult thing to research), so we still know little about specific actions that can be taken to prevent it. For the things we do know, such as tangled umbilical cords, there is very little warning &#8212; there is a window of perhaps 5-10 minutes from the time of fetal distress in which to save the baby. Knowing this actually relieved us of a great deal of guilt as we had initially second guessed ourselves, retracing our steps in the days leading up to Belle’s birth and imagining ways we could/should have known to try to get her out earlier.</p>
<p>Regardless of the statistics, regardless of whether we’ll know the cause of Belle’s death, it all just ends up feeling unfair. I’ve been robbed of my little girl, whose heart I had heard beating just days before. Belle should have had her fair shot at life, and I’m sure she would have made hers a great one. It shouldn’t have been this way, but that is what happened and now we must live with that and move on. In this, I’m so thankful for the amazing relationship I have with Cheryl. We’re both hurting, immensely, but we also are optimists who have both already overcome our fair share of challenges in our lives. Together, and with the help of family and friends, we’ll regroup and carry on, carrying Belle’s memory with us.</p>
<p>Little Belle, I’ll love you forever.</p>
<p style="text-align:left;"><strong>Addendum</strong></p>
<p style="text-align:left;">There are many people who have provided us with amazing, and often unexpected, support over the last week.</p>
<p>Our <a href="http://en.wikipedia.org/wiki/Doula">doula</a>, <a href="http://shelleyscotka.com/">Shelley Scotka</a>, was our shining light on the day of Belle’s birth. Many people have probably never heard of doulas &#8212; summarizing quickly, they are amazing women who assist in natural childbirth. They bring their knowledge of traditional birthing techniques and practical experience from many births to bear on yours, including translating what the doctors are saying and doing so that you hear what is going on, in simple, understandable terms. Shelley was there for our son’s delivery, a 50+ hour marathon that she did a great deal to ease. Little did we know that she would be every bit as vital for us for a stillbirth as she was for a live birth. She was a rock who helped before, during and after the delivery, and who continues to shower us with love and care.</p>
<p>We’re also incredibly thankful for the medical team that delivered Belle last Thursday at St. David’s North Austin. Our practice is <a href="http://www.obgynnorth.com/">OB-GYN North</a>, and the midwives, doctor, and technician who had to tell us that Belle’s heartbeat had stopped were caring and kind, and helped us immensely with the initial shock and disbelief. <a href="http://www.obgynnorth.com/our_clinic/our_staff/kathy_harrisonshort_cnm">Kathy Harrison-Short, CNM</a>  had caught our son two years before and she immediately came to comfort us. <a href="http://www.obgynnorth.com/our_clinic/our_staff/lisa_carlile_cnm">Lisa Carlile, CNM</a> stayed past her shift and was the one who ultimately caught Belle, at Cheryl’s request. <a href="http://www.seton.net/find_a_physician/schmitz/martha">Dr. Martha Smitz</a> was the physician on duty that day. She demonstrated tremendous sensitivity, compassion and overwhelming competence throughout. She had an uncanny ability to put us at ease even in the midst of the sorrow and confusion we were going through. The nurses, other doctors, social worker and pastor were all similarly supportive and sensitive. The nurses deserve special thanks for taking such great care of Cheryl before the delivery and of Belle after it. Everyone treated us, and Belle, with tremendous dignity.</p>
<p>Since that day, our family, friends and colleagues have been incredibly supportive. One of the blessings in tragedy is the concrete realization that one is surrounded by a wonderful support network. My younger brother lives here in Austin and my mother had just arrived, ready to help us with Belle; they’ve been helping us through the whole thing, especially with our toddler son, even while dealing with their own loss and grief. My father flew in from Chicago, and my older brother immediately came over from Baton Rouge with his daughter. The sound of her playing with our toddler son over the weekend was a welcome, joyful addition that helped counter our tendency toward a somber mood. My brother’s wife helped us a great deal from afar, providing support both as a family member and as a practicing physician. My step-father will be here soon, a delayed visit (at my request) since I knew we’d need more backup once the main family contingent was gone.</p>
<p>Others have also given us great strength, including sharing their own pain and anger at the situation, and in a few cases, their own direct experience with stillbirths. There have been generous offers of help, including offers to teach some of my classes in the coming weeks. Though I’ve so far responded to almost none of them, I’ve read and appreciated every email of support from friends, colleagues, and students. In a way, this post is my response, so please consider this my thank you to you all. And to those who I have not yet gotten in touch with about Belle’s death, please understand that there has not been any particular plan or care with my communications regarding it &#8212; I’m just now getting geared up to pass the word on to more friends, and some of you are probably seeing this post as a result of that effort.</p>
<p>I must also give high praise to the people at <a href="http://www.cookwaldenfuneralhome.com/dm20/en_US/locations/48/4884/index.page">Cook-Walden</a> funeral homes. They have treated us very kindly and have been incredibly responsive to our needs. One of the things about the situation is that many decisions must be made in rapid succession, and you get some of them not-quite-right the first time around. Cook-Walden was very accommodating to changes in how we wanted to do the service and burial and to requests for articles of Belle’s that we only realized later that we’d want (such as a lock of her hair). They treated us and Belle with dignity and allowed us time and space to make decisions and say goodbye to her.</p>
<p>Finally, I must thank the volunteers from <a href="http://www.nowilaymedowntosleep.org/home/">Now I Lay Me Down To Sleep</a>, who Shelley called in for us. NILMDTS is a non-profit that has professional photographers who come to take pictures of stillborn babies and their families, and then later retouch them to provide nicer images of the baby than one could generally hope to capture by oneself. They were caring and professional, and we look forward to seeing the result of their work with Belle. If you are looking for a great non-profit to donate to, please consider them.</p>
]]></content:encoded>
			<wfw:commentRss>https://bcomposes.wordpress.com/2011/10/05/belle-scarlett-baldridge/feed/</wfw:commentRss>
		<slash:comments>16</slash:comments>
	
		<media:content url="https://secure.gravatar.com/avatar/1a5262c6575daf92f62f5cc6d9f1f00f?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jasonbaldridge</media:title>
		</media:content>
	</item>
		<item>
		<title>First steps in Scala for beginning programmers, Part 8</title>
		<link>https://bcomposes.wordpress.com/2011/09/19/first-steps-in-scala-for-beginning-programmers-part-8/</link>
		<comments>https://bcomposes.wordpress.com/2011/09/19/first-steps-in-scala-for-beginning-programmers-part-8/#comments</comments>
		<pubDate>Mon, 19 Sep 2011 17:02:44 +0000</pubDate>
		<dc:creator>jasonbaldridge</dc:creator>
				<category><![CDATA[programming]]></category>
		<category><![CDATA[scala]]></category>
		<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://bcomposes.wordpress.com/?p=133</guid>
		<description><![CDATA[Topics: scala.io.Source, accessing files, flatMap, mutable Maps Preface This is part 8 of tutorials for first-time programmers getting into Scala. Other posts are on this blog, and you can get links to those and other resources on the links page of the Computational Linguistics course I’m creating these for. This tutorial is about accessing the &#8230;<p><a href="https://bcomposes.wordpress.com/2011/09/19/first-steps-in-scala-for-beginning-programmers-part-8/" class="more-link">Read More</a></p><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=bcomposes.wordpress.com&amp;blog=24939937&amp;post=133&amp;subd=bcomposes&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><strong>Topics</strong>: <em>scala.io.Source, accessing files, flatMap, mutable Maps</em></p>
<h2>Preface</h2>
<p>This is part 8 of tutorials for first-time programmers getting into Scala. Other posts are on this blog, and you can get links to those and other resources on <a href="http://icl-f11.utcompling.com/links">the links page of the Computational Linguistics course</a> I’m creating these for.</p>
<p>This tutorial is about accessing the file system in order to work with text files. The previous tutorial showed how to build a Map that contains the counts of each word type in a given text. However, it was assumed that the text was available in a String variable, and typically we are interested in knowing things about files that live on the file system, or on the internet. This tutorial shows how to read a file&#8217;s contents into Scala for processing, either by building a single String for the file or by consuming it line by line in a streaming fashion. Along the way, mutable Maps are introduced as a way to enable word counting without reading an entire file into memory.</p>
<h2>Word count on the contents of a file</h2>
<p>As an example, we&#8217;ll use <a href="http://www.gutenberg.org/cache/epub/1661/pg1661.txt"><em>The Adventures of Sherlock Holmes</em> from Project Gutenberg</a>. Download it, put it into a directory, and then start up the Scala REPL in that directory. To access files, we&#8217;ll use the <strong>Source</strong> class, so to start you need to import it.</p>
<p><pre class="brush: scala;">
scala&gt; import scala.io.Source
import scala.io.Source
</pre></p>
<p><strong>Source</strong> provides a number of ways to interact with files and make them accessible to you in your Scala program. The <strong>fromFile</strong> method is the one you&#8217;ll probably need most.</p>
<p><pre class="brush: scala;">
scala&gt; Source.fromFile(&quot;pg1661.txt&quot;)
res3: scala.io.BufferedSource = non-empty iterator
</pre></p>
<p>This creates a <strong>BufferedSource</strong>, from which you can easily get all of the file&#8217;s contents as a String.</p>
<p><pre class="brush: scala;">
scala&gt; val holmes = Source.fromFile(&quot;pg1661.txt&quot;).mkString
holmes: String =
&quot;Project Gutenberg's The Adventures of Sherlock Holmes, by Arthur Conan Doyle

This eBook is for the use of anyone anywhere at no cost and with
almost no restrictions whatsoever.  You may copy it, give it away or
re-use it under the terms of the Project Gutenberg License included
with this eBook or online at www.gutenberg.net
&lt;...many more lines...&gt;
</pre></p>
<p>With this, you can do the same things as shown in <a title="First steps in Scala for beginning programmers, Part 7" href="http://bcomposes.wordpress.com/2011/09/12/first-steps-in-scala-for-beginning-programmers-part-7/">tutorial 7</a> to get the word counts (except that here we&#8217;ll split on white space sequences rather than just a single space).</p>
<p><pre class="brush: scala;">
scala&gt; val counts = holmes.split(&quot;\\s+&quot;).groupBy(x=&gt;x).mapValues(x=&gt;x.length)
counts: scala.collection.immutable.Map[java.lang.String,Int] = Map(wood-work, -&gt; 1, &quot;Pray, -&gt; 1, herself. -&gt; 2, stern-post -&gt; 1, &quot;Should -&gt; 1, incident -&gt; 8, serious -&gt; 14, earth--&quot; -&gt; 2, sinister -&gt; 10, comply -&gt; 7, breaks -&gt; 1, forgotten -&gt; 3, precious -&gt; 10, 'It -&gt; 3, compliment -&gt; 2, suite, -&gt; 1, &quot;DEAR -&gt; 1, summarise. -&gt; 1, &quot;Done -&gt; 1, fine.' -&gt; 1, lover -&gt; 5, of. -&gt; 2, lead. -&gt; 1, plentiful -&gt; 1, 'Lone -&gt; 4, malignant -&gt; 1, terrible -&gt; 14, rate -&gt; 1, mole -&gt; 1, assert -&gt; 1, lights -&gt; 2, Stevenson, -&gt; 1, submitted -&gt; 4, tap. -&gt; 1, beard, -&gt; 1, band--a -&gt; 1, force! -&gt; 1, snow -&gt; 7, Produced -&gt; 2, ask, -&gt; 1, purchasing -&gt; 1, Hall, -&gt; 1, wall. -&gt; 5, remarked -&gt; 32, laughing -&gt; 4, member.&quot; -&gt; 1, 30,000 -&gt; 2, Redistributing -&gt; 1, coat, -&gt; 6, &quot;'One -&gt; 2, 'band,' -&gt; 1, relapsed -&gt; 1, apol...

scala&gt; counts(&quot;Holmes&quot;)
res2: Int = 197

scala&gt; counts(&quot;Watson&quot;)
res3: Int = 4
</pre></p>
<p>Lest you think it strange that <em>Watson</em> only shows up four times, keep in mind that we split on whitespace, and that means that in a sentence like the following, the token of interest is <em>Watson,&#8221;</em> rather than <em>Watson</em>.</p>
<p style="padding-left:30px;"><em>&#8220;You could not possibly have come at a better time, my dear Watson,&#8221; he said cordially.</em></p>
<p>Looking that and others up shows more tokens containing <em>Watson</em> in the story.</p>
<p><pre class="brush: scala;">
scala&gt; counts(&quot;Watson,\&quot;&quot;)
res4: Int = 19

scala&gt; counts(&quot;Watson,&quot;)
res5: Int = 40

scala&gt; counts(&quot;Watson.&quot;)
res6: Int = 10
</pre></p>
<p>Of course, the real problem is that tokenizing on whitespace is too crude. Doing this properly generally takes a good hand-built tokenizer (which is able to keep tokens like <em>e.g.</em> and <em>Mr.</em> and <em>Yahoo!</em> while splitting punctuation off most words) or a machine-learned one that is trained on data hand-labeled for tokens. For an example of the latter, see <a href="http://incubator.apache.org/opennlp/documentation/manual/opennlp.html#tools.tokenizer">the Apache OpenNLP toolkit tokenizers</a>, which include pre-trained models for English.</p>
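<p>As a rough middle ground, here is a minimal sketch that strips punctuation from the edges of each whitespace-separated token, so that <em>Watson,&#8221;</em> and <em>Watson.</em> both count as <em>Watson</em>. The helper name <strong>simpleTokens</strong> and its regular expression are assumptions made for illustration, not a proper tokenizer: it would also turn <em>Mr.</em> into <em>Mr</em> and <em>Yahoo!</em> into <em>Yahoo</em>.</p>
<p><pre class="brush: scala;">
// Hypothetical helper: split on whitespace, then strip any characters that
// are not letters or digits from both ends of each token.
// Punctuation inside a token (as in wood-work) is kept.
def simpleTokens(text: String): Array[String] = {
  val edges = &quot;^[^\\p{L}\\p{N}]+|[^\\p{L}\\p{N}]+$&quot;
  text.split(&quot;\\s+&quot;).map(tok =&gt; tok.replaceAll(edges, &quot;&quot;)).filter(tok =&gt; tok.nonEmpty)
}

val sample = &quot;\&quot;You could not possibly have come at a better time, my dear Watson,\&quot; he said cordially.&quot;

// Same groupBy pattern as before, written with map so it returns a plain Map.
val sampleCounts = simpleTokens(sample).groupBy(x =&gt; x).map { case (word, hits) =&gt; (word, hits.length) }
</pre></p>
<p>With this, <strong>sampleCounts(&quot;Watson&quot;)</strong> finds the token even though it is followed by a comma and a quotation mark in the text.</p>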
<h2>Working line by line</h2>
<p>Quite often, you need to work through a file line by line, rather than reading the entire thing in as a single string as we did above. For example, you might need to process each line differently, so just having it as a single String isn&#8217;t particularly convenient. Or, you might be working with a large file that cannot easily fit into memory (and reading it in as a single String loads the whole thing into memory). You can obtain the lines in the file as an <strong>Iterator[String]</strong>, in which each item is a single line from the file, using the <strong>getLines</strong> method.</p>
<p><pre class="brush: scala;">
scala&gt; Source.fromFile(&quot;pg1661.txt&quot;).getLines
res4: Iterator[String] = non-empty iterator
</pre></p>
<p>This iterator is ready for you to consume lines, but it doesn&#8217;t read all of the file into memory right away &#8212; instead it buffers it such that each line will be available for you as you ask for it, essentially reading off disk as you demand more lines. You can think of this as <em>streaming</em> the file to your Scala program, much like modern audio and video content is streamed to your computer: it is never actually stored, but is just transferred in parts to where it is needed, when it is needed.</p>
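<p>The on-demand behavior can be seen with any Iterator, not just one backed by a file. The following is a small sketch that uses a made-up in-memory list of lines and a counter (both are assumptions for illustration) to show that work only happens when items are requested.</p>
<p><pre class="brush: scala;">
// Stand-in for the lines of a file; these names are made up for illustration.
val lines = List(&quot;one fish&quot;, &quot;two fish&quot;, &quot;red fish&quot;, &quot;blue fish&quot;)
var touched = 0

// map on an Iterator is lazy: the function does not run until items are requested.
val upper = lines.iterator.map { line =&gt; touched += 1; line.toUpperCase }
val before = touched   // no lines have been processed yet

val firstTwo = upper.take(2).toList
val after = touched    // only the lines we asked for were processed
</pre></p>
<p>Here <strong>before</strong> is 0 and <strong>after</strong> is 2: the last two lines were never touched, which is the same reason a file-backed Iterator does not need the whole file in memory.</p>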
<p>Of course, Iterators share much with sequence data structures like Lists: once we have an Iterator, we can use <strong>foreach</strong>, <strong>for</strong>, <strong>map</strong>, etc. on it. So to print out all of the lines in the file, we can do the following.</p>
<p><pre class="brush: scala;">
scala&gt; Source.fromFile(&quot;pg1661.txt&quot;).getLines.foreach(println)
Project Gutenberg's The Adventures of Sherlock Holmes, by Arthur Conan Doyle

This eBook is for the use of anyone anywhere at no cost and with
almost no restrictions whatsoever.  You may copy it, give it away or
re-use it under the terms of the Project Gutenberg License included
with this eBook or online at www.gutenberg.net

Title: The Adventures of Sherlock Holmes

Author: Arthur Conan Doyle
&lt;...many more lines...&gt;
</pre></p>
<p>That creates a lot of output, but it shows you how you can easily create your own Scala implementation of the Unix <strong>cat</strong> program: just save the following line in a file called <strong>cat.scala</strong>:</p>
<p><pre class="brush: scala;">
scala.io.Source.fromFile(args(0)).getLines.foreach(println)
</pre></p>
<p>And then call that with the name of the file to list its contents.</p>
<p><pre class="brush: bash;">
$ scala cat.scala pg1661.txt
</pre></p>
<p>Back in the REPL, it is somewhat less-than-ideal to see the entire file. If you just want to see the start of the file, use the <strong>take</strong> method on the Iterator before the <strong>foreach</strong>.</p>
<p><pre class="brush: scala;">
scala&gt; Source.fromFile(&quot;pg1661.txt&quot;).getLines.take(5).foreach(println)
Project Gutenberg's The Adventures of Sherlock Holmes, by Arthur Conan Doyle

This eBook is for the use of anyone anywhere at no cost and with
almost no restrictions whatsoever.  You may copy it, give it away or
re-use it under the terms of the Project Gutenberg License included
</pre></p>
<p>The <strong>take</strong> method is quite useful in general with any sequence, and provides the complement of the drop method, as shown in the following examples on a simple <strong>List[Int]</strong>.</p>
<p><pre class="brush: scala;">
scala&gt; val numbers = (1 to 10).toList
numbers: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

scala&gt; numbers.take(3)
res12: List[Int] = List(1, 2, 3)

scala&gt; numbers.drop(3)
res13: List[Int] = List(4, 5, 6, 7, 8, 9, 10)

scala&gt; numbers.take(3) ::: numbers.drop(3)
res14: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
</pre></p>
<h2>Word counting line by line, first try</h2>
<p>Now that we&#8217;ve seen how to read a file and start working with it line by line, how do we count the number of occurrences of each word? Recall from <a title="First steps in Scala for beginning programmers, Part 7" href="http://bcomposes.wordpress.com/2011/09/12/first-steps-in-scala-for-beginning-programmers-part-7/">tutorial 7</a> and above that the starting point was to have a sequence (Array, List, etc.) of Strings in which each element is a word token. To start moving toward that, we can simply use the <strong>toList</strong> method on the <strong>Iterator[String]</strong> obtained from <strong>getLines</strong>.</p>
<p><pre class="brush: scala;">
scala&gt; val holmes = Source.fromFile(&quot;pg1661.txt&quot;).getLines.toList
holmes: List[String] = List(The Project Gutenberg EBook of The Adventures of Sherlock Holmes, by Sir Arthur Conan Doyle, (#15 in our series by Sir Arthur Conan Doyle), &quot;&quot;, Copyright laws are changing all over the world. Be sure to check the, copyright laws for your country before downloading or redistributing, this or any other Project Gutenberg eBook., &quot;&quot;, This header should be the first thing seen when viewing this Project, Gutenberg file.  Please do not remove it.  Do not change or edit the, header without written permission., &quot;&quot;, Please read the &quot;legal small print,&quot; and other information about the, eBook and Project Gutenberg at the bottom of this file.  Included is, important information about your specific rights and restrictions in, how the file may be used.  You can also find ou...
</pre></p>
<p>We now have the contents of the file as a <strong>List[String]</strong>, and may proceed to do useful things with it. For example, we could <strong>map</strong> each line (a String) to a sequence of space-separated Strings.</p>
<p><pre class="brush: scala;">
scala&gt; val listOfListOfWords = Source.fromFile(&quot;pg1661.txt&quot;).getLines.toList.map(x =&gt; x.split(&quot; &quot;).toList)
listOfListOfWords: List[List[java.lang.String]] = List(List(Project, Gutenberg's, The, Adventures, of, Sherlock, Holmes,, by, Arthur, Conan, Doyle), List(&quot;&quot;), List(This, eBook, is, for, the, use, of, anyone, anywhere, at, no, cost, and, with), List(almost, no, restrictions, whatsoever., &quot;&quot;, You, may, copy, it,, give, it, away, or), List(re-use, it, under, the, terms, of, the, Project, Gutenberg, License, included), List(with, this, eBook, or, online, at, www.gutenberg.net), List(&quot;&quot;), List(&quot;&quot;), List(Title:, The, Adventures, of, Sherlock, Holmes), List(&quot;&quot;), List(Author:, Arthur, Conan, Doyle), List(&quot;&quot;), List(Posting, Date:, April, 18,, 2011, [EBook, #1661]), List(First, Posted:, November, 29,, 2002), List(&quot;&quot;), List(Language:, English), List(&quot;&quot;), List(&quot;&quot;), List(***, START, OF, THIS, PRO...
</pre></p>
<p>And, as we saw in <a title="First steps in Scala for beginning programmers, Part 7" href="http://bcomposes.wordpress.com/2011/09/12/first-steps-in-scala-for-beginning-programmers-part-7/">tutorial 7</a>, when we have a List of Lists, we can use <strong>flatten</strong> to create one big List.</p>
<p><pre class="brush: scala;">
scala&gt; val listOfWords = listOfListOfWords.flatten
listOfWords: List[java.lang.String] = List(Project, Gutenberg's, The, Adventures, of, Sherlock, Holmes,, by, Arthur, Conan, Doyle, &quot;&quot;, This, eBook, is, for, the, use, of, anyone, anywhere, at, no, cost, and, with, almost, no, restrictions, whatsoever., &quot;&quot;, You, may, copy, it,, give, it, away, or, re-use, it, under, the, terms, of, the, Project, Gutenberg, License, included, with, this, eBook, or, online, at, www.gutenberg.net, &quot;&quot;, &quot;&quot;, Title:, The, Adventures, of, Sherlock, Holmes, &quot;&quot;, Author:, Arthur, Conan, Doyle, &quot;&quot;, Posting, Date:, April, 18,, 2011, [EBook, #1661], First, Posted:, November, 29,, 2002, &quot;&quot;, Language:, English, &quot;&quot;, &quot;&quot;, ***, START, OF, THIS, PROJECT, GUTENBERG, EBOOK, THE, ADVENTURES, OF, SHERLOCK, HOLMES, ***, &quot;&quot;, &quot;&quot;, &quot;&quot;, &quot;&quot;, Produced, by, an, anonymous, Project, Gut...
</pre></p>
<p>But, now you might recognize that this is the <em>map-then-flatten</em> pattern we saw previously, which means we can <strong>flatMap</strong> it instead.</p>
<p><pre class="brush: scala;">
scala&gt; val flatMappedWords = Source.fromFile(&quot;pg1661.txt&quot;).getLines.toList.flatMap(x =&gt; x.split(&quot; &quot;))
flatMappedWords: List[java.lang.String] = List(Project, Gutenberg's, The, Adventures, of, Sherlock, Holmes,, by, Arthur, Conan, Doyle, &quot;&quot;, This, eBook, is, for, the, use, of, anyone, anywhere, at, no, cost, and, with, almost, no, restrictions, whatsoever., &quot;&quot;, You, may, copy, it,, give, it, away, or, re-use, it, under, the, terms, of, the, Project, Gutenberg, License, included, with, this, eBook, or, online, at, www.gutenberg.net, &quot;&quot;, &quot;&quot;, Title:, The, Adventures, of, Sherlock, Holmes, &quot;&quot;, Author:, Arthur, Conan, Doyle, &quot;&quot;, Posting, Date:, April, 18,, 2011, [EBook, #1661], First, Posted:, November, 29,, 2002, &quot;&quot;, Language:, English, &quot;&quot;, &quot;&quot;, ***, START, OF, THIS, PROJECT, GUTENBERG, EBOOK, THE, ADVENTURES, OF, SHERLOCK, HOLMES, ***, &quot;&quot;, &quot;&quot;, &quot;&quot;, &quot;&quot;, Produced, by, an, anonymous, Project,...
</pre></p>
<p>But you should be a bit bothered by all this: wasn&#8217;t the idea here (in part) not to read all of the lines in at once? Indeed, with what we did above, as soon as we said <strong>toList</strong> on the Iterator, the whole file was read into memory. However, we can do without the <strong>toList</strong> step and just directly <strong>flatMap</strong> the Iterator and get a new Iterator over the tokens rather than the lines.</p>
<p><pre class="brush: scala;">
scala&gt; val flatMappedWords = Source.fromFile(&quot;pg1661.txt&quot;).getLines.flatMap(x =&gt; x.split(&quot; &quot;))
flatMappedWords: Iterator[java.lang.String] = non-empty iterator
</pre></p>
<p>Now, if we want to count the words, we can convert that to a List and do the <strong>groupBy</strong> and <strong>mapValues</strong> trick we&#8217;ve seen already (output omitted).</p>
<p><pre class="brush: scala;">
scala&gt; val counts = Source.fromFile(&quot;pg1661.txt&quot;).getLines.flatMap(x =&gt; x.split(&quot; &quot;)).toList.groupBy(x=&gt;x).mapValues(x=&gt;x.length)
</pre></p>
<p>Oops &#8212; that worked, but we once again brought the whole file into memory because the List that was created from <strong>toList</strong> holds every token in the file. We&#8217;ll see next how to use a <em>mutable</em> <strong>Map</strong> to get around this.</p>
<h2>Word counting by streaming with an Iterator and using mutable Maps</h2>
<p>In all of the tutorials so far, I&#8217;ve pretty much stuck to immutable data structures except when mutable ones show up due to context (like Arrays coming out of the <strong>split</strong> method). It&#8217;s good to try to make use of immutable data structures where possible, but there are times when mutable ones are more convenient and perhaps more appropriate.</p>
<p>With the immutable Maps we saw in the previous tutorial, you could not change the assignment to a key, nor could you add a new key.</p>
<p><pre class="brush: scala;">
scala&gt; val lettersToNumbers = Map(&quot;A&quot;-&gt;1, &quot;B&quot;-&gt;2, &quot;C&quot;-&gt;3)
lettersToNumbers: scala.collection.immutable.Map[java.lang.String,Int] = Map(A -&gt; 1, B -&gt; 2, C -&gt; 3)

scala&gt; lettersToNumbers(&quot;A&quot;) = 4
&lt;console&gt;:9: error: value update is not a member of scala.collection.immutable.Map[java.lang.String,Int]
lettersToNumbers(&quot;A&quot;) = 4

scala&gt; lettersToNumbers(&quot;D&quot;) = 5
&lt;console&gt;:9: error: value update is not a member of scala.collection.immutable.Map[java.lang.String,Int]
lettersToNumbers(&quot;D&quot;) = 5
</pre></p>
<p>There is another kind of Map, <strong>scala.collection.mutable.Map</strong>, that does allow this sort of behavior.</p>
<p><pre class="brush: scala;">
scala&gt; import scala.collection.mutable
import scala.collection.mutable

scala&gt; val mutableLettersToNumbers = mutable.Map(&quot;A&quot;-&gt;1, &quot;B&quot;-&gt;2, &quot;C&quot;-&gt;3)
mutableLettersToNumbers: scala.collection.mutable.Map[java.lang.String,Int] = Map(C -&gt; 3, B -&gt; 2, A -&gt; 1)

scala&gt; mutableLettersToNumbers(&quot;A&quot;) = 4

scala&gt; mutableLettersToNumbers(&quot;D&quot;) = 5

scala&gt; mutableLettersToNumbers
res4: scala.collection.mutable.Map[java.lang.String,Int] = Map(C -&gt; 3, D -&gt; 5, B -&gt; 2, A -&gt; 4)
</pre></p>
<p>It also has a handy way to increase the count associated with a key, using the <strong>+=</strong> method.</p>
<p><pre class="brush: scala;">
scala&gt; mutableLettersToNumbers(&quot;D&quot;) += 5

scala&gt; mutableLettersToNumbers
res6: scala.collection.mutable.Map[java.lang.String,Int] = Map(C -&gt; 3, D -&gt; 10, B -&gt; 2, A -&gt; 4)
</pre></p>
<p>However, we can&#8217;t use that method with a key that doesn&#8217;t exist.</p>
<p><pre class="brush: scala;">
scala&gt; mutableLettersToNumbers(&quot;E&quot;) += 1
java.util.NoSuchElementException: key not found: E
&lt;...stacktrace...&gt;
</pre></p>
<p>Fortunately, we can provide a default. Here&#8217;s an example of starting a new Map with a default of 0.</p>
<p><pre class="brush: scala;">
scala&gt; val counts = mutable.Map[String,Int]().withDefault(x=&gt;0)
counts: scala.collection.mutable.Map[String,Int] = Map()

scala&gt; counts(&quot;Z&quot;) += 1

scala&gt; counts(&quot;Y&quot;) += 1

scala&gt; counts(&quot;Z&quot;) += 1

scala&gt; counts
res11: scala.collection.mutable.Map[String,Int] = Map(Z -&gt; 2, Y -&gt; 1)
</pre></p>
<p><em>Note</em>: when you start with some values already in a Map, Scala can infer the types of the keys and the values, but when initializing an empty Map, it is necessary to explicitly declare the key and value types.</p>
<p>With this in hand, here is how we can use <strong>flatMap</strong> plus a mutable Map to count words in a text without reading the entire text into memory.</p>
<p><pre class="brush: scala;">
import scala.collection.mutable
val counts = mutable.Map[String, Int]().withDefault(x=&gt;0)
for (token &lt;- scala.io.Source.fromFile(&quot;pg1661.txt&quot;).getLines.flatMap(x =&gt; x.split(&quot;\\s+&quot;)))
  counts(token) += 1
</pre></p>
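<p>To try the same pattern without the Holmes file, the loop can be wrapped in a function that accepts any Iterator of lines. This is a sketch under assumptions: the name <strong>countTokens</strong> is made up, and unlike the loop above it skips the empty tokens that blank lines produce.</p>
<p><pre class="brush: scala;">
import scala.collection.mutable

// Hypothetical helper: count whitespace-separated tokens from any Iterator of lines.
// Only the Map of counts is kept around, never a List of all lines or tokens.
def countTokens(lines: Iterator[String]): Map[String, Int] = {
  val counts = mutable.Map[String, Int]().withDefault(x =&gt; 0)
  lines.flatMap(line =&gt; line.split(&quot;\\s+&quot;)).foreach { token =&gt;
    if (token.nonEmpty) counts(token) += 1
  }
  counts.toMap  // hand back an immutable copy
}

val demo = countTokens(Iterator(&quot;the dog saw the cat&quot;, &quot;the cat ran&quot;))
</pre></p>
<p>For this small input, <strong>demo(&quot;the&quot;)</strong> is 3 and <strong>demo(&quot;cat&quot;)</strong> is 2. Passing the result of <strong>getLines</strong> on a file instead of the in-memory Iterator gives the same streaming behavior as the loop.</p>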
<p>Having created the counts Map in this way, we can convert it to an immutable Map with the <strong>toMap</strong> method once we are done adding elements.</p>
<p><pre class="brush: scala;">
scala&gt; val fixedCounts = counts.toMap
fixedCounts: scala.collection.immutable.Map[String,Int] = Map(wood-work, -&gt; 1,
&lt;...output truncated...&gt;
</pre></p>
<p>Now we can&#8217;t modify the values in <em>fixedCounts</em>, which has advantages in many contexts: we can&#8217;t accidentally overwrite values or add unwanted keys, and an immutable Map can be shared safely between threads, which helps with parallel processing.</p>
<p><pre class="brush: scala;">
scala&gt; fixedCounts(&quot;Holmes&quot;) = 0
&lt;console&gt;:13: error: value update is not a member of scala.collection.immutable.Map[String,Int]
fixedCounts(&quot;Holmes&quot;) = 0
^
</pre></p>
<h2>Reading a file from a URL</h2>
<p>As it turns out, <strong>scala.io.Source</strong> can do a lot more than read from a file. Another example is to read from a URL to access a file on the internet, using the <strong>fromURL</strong> method.</p>
<p><pre class="brush: scala;">
val holmesUrl = &quot;&quot;&quot;http://www.gutenberg.org/cache/epub/1661/pg1661.txt&quot;&quot;&quot;
for (line &lt;- Source.fromURL(holmesUrl).getLines)
  println(line)
</pre></p>
<p>If you are just going to analyze the same file again and again, this is probably not what you need &#8212; just download the file and use it locally. However, it can be quite useful in contexts where you are exploring links within pages (e.g. while processing Wikipedia or Twitter data) and need to read in content from URLs on the fly.</p>
<h2>Use (up) the Source</h2>
<p>A final note on the Iterators you get with Source.fromFile and Source.fromURL: you can only iterate through them once! This is part of what makes them more efficient &#8212; they aren&#8217;t holding all that text in memory. So, don&#8217;t be surprised if you get the following behavior.</p>
<p><pre class="brush: scala;">

scala&gt; val holmesIterator = Source.fromFile(&quot;pg1661.txt&quot;).getLines
holmesIterator: Iterator[String] = non-empty iterator

scala&gt; holmesIterator.foreach(println)
Project Gutenberg's The Adventures of Sherlock Holmes, by Arthur Conan Doyle

This eBook is for the use of anyone anywhere at no cost and with
almost no restrictions whatsoever.  You may copy it, give it away or
re-use it under the terms of the Project Gutenberg License included
with this eBook or online at www.gutenberg.net

&lt;...many lines of output...&gt;

This Web site includes information about Project Gutenberg-tm,
including how to make donations to the Project Gutenberg Literary
Archive Foundation, how to help produce our new eBooks, and how to
subscribe to our email newsletter to hear about new eBooks.

scala&gt; holmesIterator.foreach(println)
&lt;...nothing output!...&gt;

</pre></p>
<p>So, the Iterator is used up! If you want to go through the file again, you&#8217;ll need to spin up a new Iterator just like you did the first time around. The neat thing about staying with the Iterators and not converting to Lists (and thus bringing everything into memory) is that each mapping operation we do on the Iterator applies only for the current item we are looking at, so we never need to read the whole file into memory.</p>
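<p>One way to make &#8220;spin up a new Iterator&#8221; routine is to wrap the call in a small function. The sketch below is an illustration under assumptions: the helper name <strong>freshLines</strong> is made up, and it writes a tiny temporary file so that it does not depend on pg1661.txt. It also leaves the opened files for the JVM to clean up; a longer-running program would keep the Source in a variable and call its <strong>close</strong> method when done.</p>
<p><pre class="brush: scala;">
import scala.io.Source
import java.io.{File, PrintWriter}

// Write a tiny throwaway file so the example stands on its own.
val tmp = File.createTempFile(&quot;holmes-demo&quot;, &quot;.txt&quot;)
tmp.deleteOnExit()
val out = new PrintWriter(tmp)
out.println(&quot;first line&quot;)
out.println(&quot;second line&quot;)
out.close()

// Hypothetical helper: every call opens the file again and returns a new Iterator.
def freshLines(path: String): Iterator[String] = Source.fromFile(path).getLines()

val it = freshLines(tmp.getPath)
val firstPass = it.toList    // consumes the Iterator
val secondPass = it.toList   // nothing left, so this is empty
val thirdPass = freshLines(tmp.getPath).toList   // a new Iterator sees all lines again
</pre></p>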
<p>Of course, if you have a reasonably small file to work with, you should feel absolutely free to <strong>toList</strong> it and work with it that way if you prefer &#8212; it will often be more convenient since you can do the <strong>groupBy</strong> and <strong>mapValues</strong> pattern.</p>
<p><span style="color:#888888;">Copyright 2011 Jason Baldridge</span></p>
<p><span style="color:#888888;">The text of this tutorial is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License. Attribution may be provided by linking to www.jasonbaldridge.com and to this original tutorial.</span></p>
<p><span style="color:#888888;">Suggestions, improvements, extensions and bug fixes welcome — please email Jason at jasonbaldridge@gmail.com or provide a comment to this post.</span></p>
]]></content:encoded>
			<wfw:commentRss>https://bcomposes.wordpress.com/2011/09/19/first-steps-in-scala-for-beginning-programmers-part-8/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="https://secure.gravatar.com/avatar/1a5262c6575daf92f62f5cc6d9f1f00f?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jasonbaldridge</media:title>
		</media:content>
	</item>
		<item>
		<title>First steps in Scala for beginning programmers, Part 7</title>
		<link>https://bcomposes.wordpress.com/2011/09/12/first-steps-in-scala-for-beginning-programmers-part-7/</link>
		<comments>https://bcomposes.wordpress.com/2011/09/12/first-steps-in-scala-for-beginning-programmers-part-7/#comments</comments>
		<pubDate>Tue, 13 Sep 2011 03:32:09 +0000</pubDate>
		<dc:creator>jasonbaldridge</dc:creator>
				<category><![CDATA[programming]]></category>
		<category><![CDATA[scala]]></category>
		<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://bcomposes.wordpress.com/?p=121</guid>
		<description><![CDATA[Topics: Maps, Sets, groupBy, Options, flatten, flatMap Preface This is part 7 of tutorials for first-time programmers getting into Scala. Other posts are on this blog, and you can get links to those and other resources on the links page of the Computational Linguistics course I’m creating these for. Lists (and other sequence data structures, &#8230;<p><a href="https://bcomposes.wordpress.com/2011/09/12/first-steps-in-scala-for-beginning-programmers-part-7/" class="more-link">Read More</a></p>]]></description>
			<content:encoded><![CDATA[<p><strong>Topics</strong>: <em>Maps, Sets, groupBy, Options, flatten, flatMap</em></p>
<h2>Preface</h2>
<p>This is part 7 of tutorials for first-time programmers getting into Scala. Other posts are on this blog, and you can get links to those and other resources on <a href="http://icl-f11.utcompling.com/links">the links page of the Computational Linguistics course</a> I’m creating these for.</p>
<p>Lists (and other sequence data structures, like Ranges and Arrays) allow you to group collections of objects in an ordered manner: you can access elements of a list by indexing their position in the list, or iterate over the list elements, one by one, using <strong>for</strong> expressions and sequence functions like <strong>map</strong>, <strong>filter</strong>, <strong>reduce</strong> and <strong>fold</strong>. Another important kind of data structure is the associative array, which you&#8217;ll come to know in Scala as a <strong>Map</strong>. (Yes, this has the unfortunate ambiguity with the <strong>map</strong> <em>function</em>, but their use will be quite clear from context.) Maps allow you to store a collection of key-value pairs and to access the values by the keys associated with them, rather than via an index (as with a List).</p>
<p>Example cases where you could use a Map:</p>
<ul>
<li>Associating English words with their German translations</li>
<li>Associating each word with its count in a given text</li>
<li>Associating each word with its possible parts-of-speech</li>
</ul>
<p>You&#8217;ll see concrete examples of each of these in this post.</p>
<h2>Creating Maps and accessing their elements</h2>
<p>Maps are quite intuitive to grasp. Here&#8217;s an example with a few English words and their German translations. One easy way of creating a Map is by passing in a list of pairs, where the first element of each pair defines a key and the second defines a corresponding value.</p>
<p><pre class="brush: scala;">
scala&gt; val engToDeu = Map((&quot;dog&quot;,&quot;Hund&quot;), (&quot;cat&quot;,&quot;Katze&quot;), (&quot;rhinoceros&quot;,&quot;Nashorn&quot;))
engToDeu: scala.collection.immutable.Map[java.lang.String,java.lang.String] = Map(dog -&gt; Hund, cat -&gt; Katze, rhinoceros -&gt; Nashorn)
</pre></p>
<p>Notice that the Map entries are of the form <em>key -&gt; value</em>. We may then retrieve the German translation for <em>dog</em> by providing the key &#8220;<em>dog</em>&#8221; to the Map we created.</p>
<p><pre class="brush: scala;">
scala&gt; engToDeu(&quot;dog&quot;)
res0: java.lang.String = Hund
</pre></p>
<p>Think for a moment about what you would have to do to accomplish this with Lists. You&#8217;d need two Lists, one for each language, and they&#8217;d need to be aligned so that each element in one list corresponded to its translation in the other list.</p>
<p><pre class="brush: scala;">
scala&gt; val engWords = List(&quot;dog&quot;,&quot;cat&quot;,&quot;rhinoceros&quot;)
engWords: List[java.lang.String] = List(dog, cat, rhinoceros)

scala&gt; val deuWords = List(&quot;Hund&quot;,&quot;Katze&quot;,&quot;Nashorn&quot;)
deuWords: List[java.lang.String] = List(Hund, Katze, Nashorn)
</pre></p>
<p>Then, to find the translation of <em>cat</em>, you would have to find the index of cat in <em>engWords</em>, and then look up that index in <em>deuWords</em>.</p>
<p><pre class="brush: scala;">
scala&gt; engWords.indexOf(&quot;cat&quot;)
res2: Int = 1

scala&gt; deuWords(engWords.indexOf(&quot;cat&quot;))
res3: java.lang.String = Katze
</pre></p>
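<p>This is quite inefficient, since <strong>indexOf</strong> has to scan through the list to find the word, and it is fragile, since the two lists must be kept aligned by hand. Maps are the right thing for what we want here, and they do the job of retrieving values for keys quite efficiently.</p>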
<p>It turns out that we can take two lists that are aligned in this way and construct a Map very easily. Recall that zipping two lists together creates one list of pairs, where each pair gives the elements that shared the same index.</p>
<p><pre class="brush: scala;">
scala&gt; engWords.zip(deuWords)
res4: List[(java.lang.String, java.lang.String)] = List((dog,Hund), (cat,Katze), (rhinoceros,Nashorn))
</pre></p>
<p>By calling the <strong>toMap</strong> method on such a List of pairs, we get a Map.</p>
<p><pre class="brush: scala;">
scala&gt; engWords.zip(deuWords).toMap
res5: scala.collection.immutable.Map[java.lang.String,java.lang.String] = Map(dog -&gt; Hund, cat -&gt; Katze, rhinoceros -&gt; Nashorn)
</pre></p>
<p>Note that even though the REPL is showing the order of the key-value pairs to be the same as the original list we constructed the map from, there is no inherent order to the elements of a Map.</p>
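<p>Here is a minimal sketch of what &#8220;no inherent order&#8221; means in practice. The names <em>m1</em> and <em>m2</em> are made up for this illustration and are not part of the running example.</p>
<p><pre class="brush: scala;">
// Two hypothetical Maps with the same pairs, listed in a different order.
val m1 = Map((&quot;dog&quot;,&quot;Hund&quot;), (&quot;cat&quot;,&quot;Katze&quot;))
val m2 = Map((&quot;cat&quot;,&quot;Katze&quot;), (&quot;dog&quot;,&quot;Hund&quot;))

// Equality only looks at the key-value pairs, not at any ordering.
println(m1 == m2)  // true

// If you need a predictable order, sort the keys yourself.
for (k &lt;- m1.keys.toList.sorted) println(k + &quot; -&gt; &quot; + m1(k))
// cat -&gt; Katze
// dog -&gt; Hund
</pre></p>
<p>So whenever the order of output matters, it is safer to sort explicitly than to rely on whatever order the REPL happens to show.</p>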
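<p>You can add elements to a Map to create a new Map using the <strong>+</strong> operator, with an arrow <strong>-&gt;</strong> between each key and its value. Note the parentheses around the pairs in the second example below: without them, as in the first example, Scala evaluates <strong>engToDeu + &quot;owl&quot;</strong> first (as String concatenation) and then applies the arrow, so you get a pair rather than a Map.</p>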
<p><pre class="brush: scala;">

scala&gt; engToDeu + &quot;owl&quot; -&gt; &quot;Eule&quot;
res6: (java.lang.String, java.lang.String) = (Map(dog -&gt; Hund, cat -&gt; Katze, rhinoceros -&gt; Nashorn)owl,Eule)

scala&gt; engToDeu + (&quot;owl&quot; -&gt; &quot;Eule&quot;, &quot;hippopotamus&quot; -&gt; &quot;Nilpferd&quot;)
res7: scala.collection.immutable.Map[java.lang.String,java.lang.String] = Map(rhinoceros -&gt; Nashorn, dog -&gt; Hund, owl -&gt; Eule, hippopotamus -&gt; Nilpferd, cat -&gt; Katze)
</pre></p>
<p>You can add one Map to another using the <strong>++</strong> operator.</p>
<p><pre class="brush: scala;">

scala&gt; val newEntries = Map((&quot;hippopotamus&quot;, &quot;Nilpferd&quot;),(&quot;owl&quot;,&quot;Eule&quot;))
newEntries: scala.collection.immutable.Map[java.lang.String,java.lang.String] = Map(hippopotamus -&gt; Nilpferd, owl -&gt; Eule)

scala&gt; val expandedEngToDeu = engToDeu ++ newEntries
expandedEngToDeu: scala.collection.immutable.Map[java.lang.String,java.lang.String] = Map(rhinoceros -&gt; Nashorn, dog -&gt; Hund, owl -&gt; Eule, hippopotamus -&gt; Nilpferd, cat -&gt; Katze)
</pre></p>
<p>You can do the same by passing in a List of tuples to the ++ operator.</p>
<p><pre class="brush: scala;">

scala&gt; engToDeu ++ List((&quot;hippopotamus&quot;, &quot;Nilpferd&quot;),(&quot;owl&quot;,&quot;Eule&quot;))
res8: scala.collection.immutable.Map[java.lang.String,java.lang.String] = Map(rhinoceros -&gt; Nashorn, dog -&gt; Hund, owl -&gt; Eule, hippopotamus -&gt; Nilpferd, cat -&gt; Katze)
</pre></p>
<p>And you can remove a key from a Map with the <strong>-</strong> operator.</p>
<p><pre class="brush: scala;">

scala&gt; engToDeu - &quot;dog&quot;
res9: scala.collection.immutable.Map[java.lang.String,java.lang.String] = Map(cat -&gt; Katze, rhinoceros -&gt; Nashorn)
</pre></p>
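<p>Since these are immutable Maps, none of these operators changes the Map it is applied to: each one gives back a new Map. The following sketch shows this with a small Map called <em>m</em> (a hypothetical name, used here to keep it apart from <em>engToDeu</em>).</p>
<p><pre class="brush: scala;">
// A hypothetical two-entry Map for illustration.
val m = Map(&quot;dog&quot; -&gt; &quot;Hund&quot;, &quot;cat&quot; -&gt; &quot;Katze&quot;)

// Each operator returns a new Map; m itself is never modified.
val added = m + (&quot;owl&quot; -&gt; &quot;Eule&quot;)
val removed = m - &quot;dog&quot;

println(m.size)                    // 2
println(added.size)                // 3
println(removed.size)              // 1
println(m.contains(&quot;dog&quot;))         // true: still in the original
println(removed.contains(&quot;dog&quot;))   // false
</pre></p>
<p>If you want to keep the result, assign it to a new <strong>val</strong>, as done above.</p>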
<p>See <a href="http://www.scala-lang.org/api/current/scala/collection/immutable/Map.html">the Map API</a> for more examples of such functions. Note: throughout this post, I&#8217;m sticking to immutable Maps. If you are looking at any other tutorials and are wondering why certain methods from those aren&#8217;t working here, they may have been using mutable Maps, which we&#8217;ll discuss later.</p>
<p>If we ask for the value associated with a key that doesn&#8217;t exist in the Map, we get an error.</p>
<p><pre class="brush: scala;">
scala&gt; engToDeu(&quot;bird&quot;)
java.util.NoSuchElementException: key not found: bird
at scala.collection.MapLike$class.default(MapLike.scala:224)
(etc.)
</pre></p>
<p>You can check for whether a key is in the Map using the <strong>contains</strong> method.</p>
<p><pre class="brush: scala;">
scala&gt; engToDeu.contains(&quot;bird&quot;)
res10: Boolean = false

scala&gt; engToDeu.contains(&quot;dog&quot;)
res11: Boolean = true
</pre></p>
<p>Let&#8217;s say you had a list of English words and wanted to look up their corresponding German words, and you want to protect yourself against the <strong>NoSuchElementException</strong>. One way to do this is to filter the words using <strong>contains</strong>, and then map the remaining ones through <em>engToDeu</em>.</p>
<p><pre class="brush: scala;">
scala&gt; val wordsToTranslate = List(&quot;dog&quot;,&quot;bird&quot;,&quot;cat&quot;,&quot;armadillo&quot;)
wordsToTranslate: List[java.lang.String] = List(dog, bird, cat, armadillo)

scala&gt; wordsToTranslate.filter(x=&gt;engToDeu.contains(x)).map(x=&gt;engToDeu(x))
res12: List[java.lang.String] = List(Hund, Katze)
</pre></p>
<p>This is a useful way of safely applying a Map to a list of items. However, we&#8217;ll see a better way to deal with missing values later on, using Options.</p>
<p>If there is a sensible default value for any key you might try with your map, you can use the <strong>getOrElse</strong> method. You provide the key as the first argument, and then the default value as the second.</p>
<p><pre class="brush: scala;">

scala&gt; engToDeu.getOrElse(&quot;dog&quot;,&quot;???&quot;)
res1: java.lang.String = Hund

scala&gt; engToDeu.getOrElse(&quot;armadillo&quot;,&quot;???&quot;)
res2: java.lang.String = ???
</pre></p>
<p>It is quite common to use <strong>getOrElse</strong> with a default of 0 for Maps that contain statistics, such as word counts (see below), where the absence of a key naturally indicates that it has, e.g., a count of zero.</p>
<p>If you have a consistent default value for any keys that aren&#8217;t in the Map, you can set it by using the <strong>withDefault</strong> method, which takes a function from the missing key to the default value.</p>
<p><pre class="brush: scala;">

scala&gt; val engToDeu = Map((&quot;dog&quot;,&quot;Hund&quot;), (&quot;cat&quot;,&quot;Katze&quot;), (&quot;rhinoceros&quot;,&quot;Nashorn&quot;)).withDefault(x =&gt; &quot;???&quot;)
engToDeu: scala.collection.immutable.Map[java.lang.String,java.lang.String] = Map(dog -&gt; Hund, cat -&gt; Katze, rhinoceros -&gt; Nashorn)

scala&gt; engToDeu(&quot;armadillo&quot;)
res3: java.lang.String = ???
</pre></p>
<p>Now you can ask for values in the usual manner, without needing to use <strong>getOrElse</strong> and provide the default every time.</p>
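<p>One thing worth checking for yourself is that a default does not add any keys to the Map. The sketch below uses made-up names (<em>withQ</em>, <em>zeros</em>) and also shows <strong>withDefaultValue</strong>, a shorthand from the standard library for the case where the default ignores the key.</p>
<p><pre class="brush: scala;">
// Hypothetical Map with a default computed from the (ignored) key.
val withQ = Map((&quot;dog&quot;,&quot;Hund&quot;), (&quot;cat&quot;,&quot;Katze&quot;)).withDefault(x =&gt; &quot;???&quot;)

println(withQ(&quot;armadillo&quot;))           // ???
println(withQ.contains(&quot;armadillo&quot;))  // false: the key was not added
println(withQ.size)                   // 2

// withDefaultValue takes the default itself rather than a function.
val zeros = Map((&quot;a&quot;, 4)).withDefaultValue(0)
println(zeros(&quot;b&quot;))                   // 0
</pre></p>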
<h2>Keys and values in Maps</h2>
<p>You may have observed that Scala tells you more than just that you have created a Map. Like List, Map is a parameterized type, which means that it is a generic way of collecting a bunch of objects of particular types together. Above we saw an instance of a <strong>Map[String, String]</strong> (leaving off the <strong>java.lang</strong> part to make it clearer). The first String indicates that the <em>keys</em> are Strings and the second that the <em>values</em> are Strings. Basically, any type can be used in either position (<em>warning</em>: you should avoid using mutable data structures as keys unless you know what you are doing). Here are some examples (try to ignore the <strong>scala.collection.immutable</strong> and <strong>java.lang</strong> parts and just focus on the <strong>Map[X,Y]</strong> signatures we get).</p>
<p><pre class="brush: scala;">
scala&gt; Map((10,&quot;ten&quot;), (100,&quot;one hundred&quot;))
res0: scala.collection.immutable.Map[Int,java.lang.String] = Map(10 -&gt; ten, 100 -&gt; one hundred)

scala&gt; Map((&quot;a&quot;,1),(&quot;b&quot;,2))
res1: scala.collection.immutable.Map[java.lang.String,Int] = Map(a -&gt; 1, b -&gt; 2)

scala&gt; Map((1,3.14), (2,6.28))
res2: scala.collection.immutable.Map[Int,Double] = Map(1 -&gt; 3.14, 2 -&gt; 6.28)

scala&gt; Map(((&quot;pi&quot;,1),3.14), ((&quot;tau&quot;,2),6.28))
res3: scala.collection.immutable.Map[(java.lang.String, Int),Double] = Map((pi,1) -&gt; 3.14, (tau,2) -&gt; 6.28)

scala&gt; Map((&quot;the&quot;,List(&quot;Determiner&quot;)),(&quot;book&quot;,List(&quot;Verb&quot;,&quot;Noun&quot;)),(&quot;off&quot;,List(&quot;Preposition&quot;,&quot;Verb&quot;)))
res4: scala.collection.immutable.Map[java.lang.String,List[java.lang.String]] = Map(the -&gt; List(Determiner), book -&gt; List(Verb, Noun), off -&gt; List(Preposition, Verb))
</pre></p>
<p>The last two examples show some very useful aspects of key and value types that allow you to use more complex keys and values. The former uses a <strong>(String, Int)</strong> pair as a key, with signature <strong>Map[(String, Int), Double]</strong>, and the latter uses a <strong>List[String]</strong> as the value, with signature <strong>Map[String, List[String]]</strong>. So you can bundle together several types using tuples and you can use parameterized data structures to parameterize another data structure.</p>
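<p>To make this concrete, here is a small sketch of how lookups work with such keys and values. The names <em>constants</em> and <em>tags</em> are hypothetical, chosen just for this illustration.</p>
<p><pre class="brush: scala;">
// A Map with (String, Int) pairs as keys: the key you look up is itself a tuple.
val constants = Map(((&quot;pi&quot;,1),3.14), ((&quot;tau&quot;,2),6.28))
println(constants((&quot;pi&quot;,1)))            // 3.14

// A Map with Lists as values: the value you get back is a List,
// so you can use the usual List methods on it.
val tags = Map((&quot;the&quot;,List(&quot;Determiner&quot;)), (&quot;book&quot;,List(&quot;Verb&quot;,&quot;Noun&quot;)))
println(tags(&quot;book&quot;).length)            // 2
println(tags(&quot;book&quot;).contains(&quot;Noun&quot;))  // true
println(tags(&quot;book&quot;)(0))                // Verb
</pre></p>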
<h2>A simple translation task</h2>
<p>Here is a mini German/English dictionary as a Map.</p>
<p><pre class="brush: scala;">
scala&gt; val miniDictionary = Map((&quot;befreit&quot;,&quot;liberated&quot;),(&quot;baeche&quot;,&quot;brooks&quot;),(&quot;eise&quot;,&quot;ice&quot;),(&quot;sind&quot;,&quot;are&quot;),(&quot;strom&quot;,&quot;river&quot;),(&quot;und&quot;,&quot;and&quot;),(&quot;vom&quot;,&quot;from&quot;))
miniDictionary: scala.collection.immutable.Map[java.lang.String,java.lang.String] = Map(und -&gt; and, eise -&gt; ice, sind -&gt; are, befreit -&gt; liberated, strom -&gt; river, vom -&gt; from, baeche -&gt; brooks)
</pre></p>
<p>We can provide a (very bad) translation of the German sentence &#8220;<em>vom eise befreit sind strom und baeche</em>&#8221; using this dictionary: we simply split the German sentence and then map over its elements, looking up each word in the dictionary.</p>
<p><pre class="brush: scala;">
scala&gt; val example = &quot;vom eise befreit sind strom und baeche&quot;
example: java.lang.String = vom eise befreit sind strom und baeche

scala&gt; example.split(&quot; &quot;).map(deuWord =&gt; miniDictionary(deuWord)).mkString(&quot; &quot;)
res0: String = from ice liberated are river and brooks
</pre></p>
<p>Okay, not quite &#8220;from the ice they are freed, the stream and brook&#8221; but then again it&#8217;s pretty much the dumbest machine translation approach available&#8230;</p>
<p>A danger of course is that we will have words that aren&#8217;t in the dictionary, leading to an exception.</p>
<p><pre class="brush: scala;">
scala&gt; val example2 = &quot;vom eise befreit sind strom und schiffe&quot;
example2: java.lang.String = vom eise befreit sind strom und schiffe

scala&gt; example2.split(&quot; &quot;).map(deuWord =&gt; miniDictionary(deuWord)).mkString(&quot; &quot;)
java.util.NoSuchElementException: key not found: schiffe
</pre></p>
<p>We&#8217;ll return to this below.</p>
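<p>In the meantime, the <strong>getOrElse</strong> method from earlier gives us a quick stopgap. This is only a sketch of one option: it marks unknown words with &#8220;???&#8221; rather than handling them properly.</p>
<p><pre class="brush: scala;">
val miniDictionary = Map((&quot;befreit&quot;,&quot;liberated&quot;),(&quot;baeche&quot;,&quot;brooks&quot;),(&quot;eise&quot;,&quot;ice&quot;),(&quot;sind&quot;,&quot;are&quot;),(&quot;strom&quot;,&quot;river&quot;),(&quot;und&quot;,&quot;and&quot;),(&quot;vom&quot;,&quot;from&quot;))
val example2 = &quot;vom eise befreit sind strom und schiffe&quot;

// Look up each word, falling back to a placeholder when it is missing.
val translated = example2.split(&quot; &quot;).map(deuWord =&gt; miniDictionary.getOrElse(deuWord, &quot;???&quot;)).mkString(&quot; &quot;)

println(translated)  // from ice liberated are river and ???
</pre></p>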
<h2>Creating Maps from Lists using groupBy</h2>
<p>We frequently have data stored in a particular data structure and would like to work with it using another data structure that organizes the data points in some other manner. Here, we&#8217;ll look at how to convert a List into a Map using the <strong>groupBy</strong> method in order to do some useful processing for working with parts-of-speech. We&#8217;ll also see the <strong>Set</strong> data structure along the way.</p>
<p>We&#8217;ll start with a very basic example of what <strong>groupBy</strong> does. Given a list of number tokens, we can obtain a Map from the number types to all of the tokens of each number.</p>
<p><pre class="brush: scala;">
scala&gt; val numbers = List(1,4,5,1,6,5,2,8,1,9,2,1)
numbers: List[Int] = List(1, 4, 5, 1, 6, 5, 2, 8, 1, 9, 2, 1)

scala&gt; numbers.groupBy(x=&gt;x)
res19: scala.collection.immutable.Map[Int,List[Int]] = Map(5 -&gt; List(5, 5), 1 -&gt; List(1, 1, 1, 1), 6 -&gt; List(6), 9 -&gt; List(9), 2 -&gt; List(2, 2), 8 -&gt; List(8), 4 -&gt; List(4))
</pre></p>
<p>As you can see from the result, <strong>groupBy</strong> took the anonymous function <strong>x=&gt;x</strong>, grouped all of the elements of the List that have the same value of <em>x</em>, and then created a Map from each <em>x</em> to the group containing its tokens. So, we get 2 mapping to a List containing 2&#8217;s, and so on. This probably seems a bit weird, but it is incredibly useful when we consider Lists that have more interesting elements in them. To do so, let&#8217;s go back to the part-of-speech tagging example from <a title="First steps in Scala for beginning programmers, Part 4" href="http://bcomposes.wordpress.com/2011/08/30/first-steps-in-scala-for-beginning-programmers-part-4/">Part 4 of these tutorials</a>. Say we have a sentence that is tagged with parts of speech, such as the following (made up) example that ensures some tag ambiguities.</p>
<p style="padding-left:30px;"><em>in the dark , a tall man saw the saw that he needed to man to cut the dark tree .</em></p>
<p>The parts-of-speech could be annotated as follows (with lots of simplifications, and apologies for any offense caused to anyone&#8217;s linguistic sensibilities).</p>
<p style="padding-left:30px;">in/Prep the/Det dark/Noun ,/Punc a/Det tall/Adjective man/Noun saw/Verb the/Det saw/Noun that/Pronoun he/Pronoun needed/Verb to/Prep man/Verb to/Prep cut/Verb the/Det dark/Adjective tree/Noun ./Punc</p>
<p><a title="First steps in Scala for beginning programmers, Part 4" href="http://bcomposes.wordpress.com/2011/08/30/first-steps-in-scala-for-beginning-programmers-part-4/">See Part 4</a> for a detailed explanation of how the following expression turns a string like this into a List of tuples.</p>
<p><pre class="brush: scala;">
scala&gt; val tagged = &quot;in/Prep the/Det dark/Noun ,/Punc a/Det tall/Adjective man/Noun saw/Verb the/Det saw/Noun that/Pronoun he/Pronoun needed/Verb to/Prep man/Verb to/Prep cut/Verb the/Det dark/Adjective tree/Noun ./Punc&quot;.split(&quot; &quot;).toList.map(x =&gt; x.split(&quot;/&quot;)).map(x =&gt; (x(0), x(1)))
tagged: List[(java.lang.String, java.lang.String)] = List((in,Prep), (the,Det), (dark,Noun), (,,Punc), (a,Det), (tall,Adjective), (man,Noun), (saw,Verb), (the,Det), (saw,Noun), (that,Pronoun), (he,Pronoun), (needed,Verb), (to,Prep), (man,Verb), (to,Prep), (cut,Verb), (the,Det), (dark,Adjective), (tree,Noun), (.,Punc))
</pre></p>
<p>Now, let&#8217;s use <strong>groupBy</strong> in various ways on this. The first thing we might be interested in is seeing which parts of speech each word is associated with.</p>
<p><pre class="brush: scala;">
scala&gt; val groupedTagged = tagged.groupBy(x =&gt; x._1)
groupedTagged: scala.collection.immutable.Map[java.lang.String,List[(java.lang.String, java.lang.String)]] = Map(in -&gt; List((in,Prep)), needed -&gt; List((needed,Verb)), . -&gt; List((.,Punc)), cut -&gt; List((cut,Verb)), saw -&gt; List((saw,Verb), (saw,Noun)), a -&gt; List((a,Det)), man -&gt; List((man,Noun), (man,Verb)), that -&gt; List((that,Pronoun)), dark -&gt; List((dark,Noun), (dark,Adjective)), to -&gt; List((to,Prep), (to,Prep)), , -&gt; List((,,Punc)), tall -&gt; List((tall,Adjective)), he -&gt; List((he,Pronoun)), tree -&gt; List((tree,Noun)), the -&gt; List((the,Det), (the,Det), (the,Det)))
</pre></p>
<p>So, now you see that the keys in the Map constructed by <strong>groupBy</strong> are the words and the values are the groups of the original elements. You can then see that the anonymous function <strong>x =&gt; x._1</strong> provided to <strong>groupBy</strong> does two things: it specifies the part of each input element that is used to group items together, and it specifies that this part defines the key space.</p>
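<p>As a quick aside, the function given to <strong>groupBy</strong> can compute anything from each element, and whatever it computes becomes the key. Here is a small sketch with a hypothetical list called <em>words</em>, grouped first by length and then by a true/false test.</p>
<p><pre class="brush: scala;">
// A hypothetical list of words, separate from the tagged sentence.
val words = List(&quot;dark&quot;, &quot;dog&quot;, &quot;tall&quot;, &quot;tree&quot;, &quot;the&quot;)

// Keys are Ints: the length of each word.
val byLength = words.groupBy(x =&gt; x.length)
println(byLength(4))  // List(dark, tall, tree)
println(byLength(3))  // List(dog, the)

// Keys are Booleans: whether each word starts with &quot;t&quot;.
val byStartsWithT = words.groupBy(x =&gt; x.startsWith(&quot;t&quot;))
println(byStartsWithT(true))   // List(tall, tree, the)
println(byStartsWithT(false))  // List(dark, dog)
</pre></p>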
<p>However, we don&#8217;t quite have what we want, which is to have the set of parts of speech associated with each word. Instead we have a List of tuples, e.g.:</p>
<p><pre class="brush: scala;">
scala&gt; groupedTagged(&quot;saw&quot;)
res21: List[(java.lang.String, java.lang.String)] = List((saw,Verb), (saw,Noun))
</pre></p>
<p>Focussing on just this for a moment, we can map this and produce a List with just the parts-of-speech, and then turn that List into a Set with the <strong>toSet</strong> method in order to get just the unique parts-of-speech.</p>
<p><pre class="brush: scala;">
scala&gt; groupedTagged(&quot;saw&quot;).map(x=&gt;x._2)
res24: List[java.lang.String] = List(Verb, Noun)

scala&gt; groupedTagged(&quot;saw&quot;).map(x=&gt;x._2).toSet
res25: scala.collection.immutable.Set[java.lang.String] = Set(Verb, Noun)
</pre></p>
<p>Converting the List to a Set didn&#8217;t do much here, but consider <em>the</em>, which has multiple tokens with the same part-of-speech.</p>
<p><pre class="brush: scala;">
scala&gt; groupedTagged(&quot;the&quot;)
res26: List[(java.lang.String, java.lang.String)] = List((the,Det), (the,Det), (the,Det))

scala&gt; groupedTagged(&quot;the&quot;).map(x=&gt;x._2)
res27: List[java.lang.String] = List(Det, Det, Det)

scala&gt; groupedTagged(&quot;the&quot;).map(x=&gt;x._2).toSet
res28: scala.collection.immutable.Set[java.lang.String] = Set(Det)
</pre></p>
<p>Sets are yet another of the useful data structures you have to work with, along with Maps and Lists. They work just like you would expect Sets to: they contain a collection of unique, unordered elements, and they allow you to see whether an element is in the set, whether one set is a subset of another, iterate over their elements, etc.</p>
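<p>Here is a brief sketch of a few of those Set operations, using two made-up Sets of tags (<em>tagsA</em> and <em>tagsB</em> are hypothetical names for this illustration).</p>
<p><pre class="brush: scala;">
val tagsA = Set(&quot;Noun&quot;, &quot;Verb&quot;)
val tagsB = Set(&quot;Verb&quot;, &quot;Adjective&quot;, &quot;Noun&quot;)

// Membership and subset tests.
println(tagsA.contains(&quot;Noun&quot;))   // true
println(tagsA.subsetOf(tagsB))    // true
println(tagsB.subsetOf(tagsA))    // false

// Order and duplicates don&#039;t matter for equality.
println(tagsA == Set(&quot;Verb&quot;, &quot;Noun&quot;, &quot;Verb&quot;))  // true

// Intersection and union; the union has no duplicate &quot;Noun&quot;.
println(tagsA.intersect(Set(&quot;Verb&quot;, &quot;Det&quot;)))     // Set(Verb)
println((tagsA ++ Set(&quot;Noun&quot;, &quot;Det&quot;)).size)      // 3
</pre></p>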
<p>Now, back to getting from the word/tag pairs to a mapping from words to possible tags for each word. The keys we got from <strong>tagged.groupBy(x =&gt; x._1)</strong> are what we want, but we want to transform the values from Lists of word/tag tokens to Sets of tags, which we can do with the <strong>mapValues</strong> method on Maps.</p>
<p><pre class="brush: scala;">
scala&gt; val wordsToTags = tagged.groupBy(x =&gt; x._1).mapValues(listOfWordTagPairs =&gt; listOfWordTagPairs.map(wordTagPair =&gt; wordTagPair._2).toSet)
wordsToTags: scala.collection.immutable.Map[java.lang.String,scala.collection.immutable.Set[java.lang.String]] = Map(in -&gt; Set(Prep), needed -&gt; Set(Verb), . -&gt; Set(Punc), cut -&gt; Set(Verb), saw -&gt; Set(Verb, Noun), a -&gt; Set(Det), man -&gt; Set(Noun, Verb), that -&gt; Set(Pronoun), dark -&gt; Set(Noun, Adjective), to -&gt; Set(Prep), , -&gt; Set(Punc), tall -&gt; Set(Adjective), he -&gt; Set(Pronoun), tree -&gt; Set(Noun), the -&gt; Set(Det))
</pre></p>
<p>The bit inside the <strong>mapValues(&#8230;)</strong> part will have some readers scrunching up their eyes, but you just need to look at the line where we got <span style="text-decoration:underline;">res28</span> above: if you understood that, then you just need to realize we are doing exactly the same thing, but now in the context of mapping over the values rather than dealing with a single value. Now you know how to map over values that you are mapping over.</p>
<p>Now that it is in hand, we can easily query the <em>wordsToTags</em> Map to see whether various words have various tags.</p>
<p><pre class="brush: scala;">
scala&gt; wordsToTags(&quot;man&quot;)(&quot;Noun&quot;)
res8: Boolean = true

scala&gt; wordsToTags(&quot;man&quot;)(&quot;Det&quot;)
res9: Boolean = false

scala&gt; wordsToTags(&quot;man&quot;)(&quot;Verb&quot;)
res10: Boolean = true

scala&gt; wordsToTags(&quot;saw&quot;)(&quot;Verb&quot;)
res11: Boolean = true
</pre></p>
<p>This is an example of how data structures within data structures (here Sets within a Map) are quite useful. (<em>Exercise</em>: think about what a tree is for a moment and how you might implement it using Lists.)</p>
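<p>Note that a query like the ones above will throw an exception if the word itself is not in the Map. One way around that, sketched here with a cut-down version of <em>wordsToTags</em> and a hypothetical helper function called <em>hasTag</em>, is to fall back to an empty Set for unknown words.</p>
<p><pre class="brush: scala;">
// A cut-down stand-in for the wordsToTags Map built above.
val wordsToTags = Map((&quot;man&quot;, Set(&quot;Noun&quot;,&quot;Verb&quot;)), (&quot;the&quot;, Set(&quot;Det&quot;)))

// Hypothetical helper: unknown words get an empty Set, so the answer is false.
def hasTag(word: String, tag: String) = wordsToTags.getOrElse(word, Set[String]()).contains(tag)

println(hasTag(&quot;man&quot;, &quot;Verb&quot;))    // true
println(hasTag(&quot;man&quot;, &quot;Det&quot;))     // false
println(hasTag(&quot;zebra&quot;, &quot;Noun&quot;))  // false, and no exception
</pre></p>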
<p>There are a variety of things you can do in computational linguistics with Maps from words to their parts-of-speech. A simple example is to compute the average number of tags per word type.</p>
<p><pre class="brush: scala;">
scala&gt; val avgTagsPerType = wordsToTags.values.map(x=&gt;x.size).sum/wordsToTags.size.toDouble
avgTagsPerType: Double = 1.2
</pre></p>
<p>If it isn&#8217;t clear to you what is going on here, tease it apart in your own REPL!</p>
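<p>If you want something to compare against after trying it yourself, here is one way to split the computation into steps. It uses a small made-up Map rather than the full one, so the numbers differ from those above.</p>
<p><pre class="brush: scala;">
// A small hypothetical stand-in for wordsToTags.
val wordsToTags = Map((&quot;saw&quot;, Set(&quot;Verb&quot;,&quot;Noun&quot;)), (&quot;the&quot;, Set(&quot;Det&quot;)), (&quot;tall&quot;, Set(&quot;Adjective&quot;)), (&quot;man&quot;, Set(&quot;Noun&quot;,&quot;Verb&quot;)))

// Step 1: the number of tags for each word type.
val sizes = wordsToTags.values.map(x =&gt; x.size)

// Step 2: the total number of tags across all word types.
val totalTags = sizes.sum
println(totalTags)  // 6

// Step 3: the number of word types.
val numTypes = wordsToTags.size
println(numTypes)   // 4

// Step 4: divide, converting to Double first so we don&#039;t get integer division.
println(totalTags / numTypes.toDouble)  // 1.5
println(totalTags / numTypes)           // 1
</pre></p>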
<p>We can turn our word/tag pairs the other way to find out which words go with each part-of-speech. The only thing we need to do is <strong>groupBy</strong> on the second element of each pair, and then map the List values to their first element and get a Set from those.</p>
<p><pre class="brush: scala;">
scala&gt; val tagsToWords = tagged.groupBy(x =&gt; x._2).mapValues(listOfWordTagPairs =&gt; listOfWordTagPairs.map(wordTagPair =&gt; wordTagPair._1).toSet)
tagsToWords: scala.collection.immutable.Map[java.lang.String,scala.collection.immutable.Set[java.lang.String]] = Map(Prep -&gt; Set(in, to), Det -&gt; Set(the, a), Noun -&gt; Set(dark, man, saw, tree), Pronoun -&gt; Set(that, he), Verb -&gt; Set(saw, needed, man, cut), Punc -&gt; Set(,, .), Adjective -&gt; Set(tall, dark))
</pre></p>
<p>This basic paradigm is a powerful one for flipping between different data structures depending on what our needs are. It also demonstrates several important concepts for working with Lists, Maps and Sets. The next section shows a simple application of this idea for counting words in a text.</p>
<h2>Counting words</h2>
<p>A common task in computational linguistics is to calculate word statistics, and the most basic of those is to count the number of tokens of each word type in a particular text. The most common way to store and access those counts is in a Map, but how do you create such a Map from a given text? If we look at a text as a list of strings, then the <strong>groupBy</strong> paradigm we used above gives us exactly what we need. In fact, it is even simpler than the word/tag manipulations done above.</p>
<p>The example text we&#8217;ll use is the tongue-twister about woodchucks.</p>
<p><pre class="brush: scala;">
scala&gt; val woodchuck = &quot;how much wood could a woodchuck chuck if a woodchuck could chuck wood ? as much wood as a woodchuck would , if a woodchuck could chuck wood .&quot;
woodchuck: java.lang.String = how much wood could a woodchuck chuck if a woodchuck could chuck wood ? as much wood as a woodchuck would , if a woodchuck could chuck wood .
</pre></p>
<p>Given this, here&#8217;s how we can compute the number of occurrences of each word type. First we <strong>groupBy</strong> on the elements. Though a list of strings isn&#8217;t as interesting as having a list of Tuples as we had with words and tags, it still produces a useful result: we now have a unique set of keys corresponding to the types of elements found in the Array, and there is a corresponding value to each one that is the Array of tokens of that type.</p>
<p><pre class="brush: scala;">
scala&gt; woodchuck.split(&quot; &quot;).groupBy(x=&gt;x)
res29: scala.collection.immutable.Map[java.lang.String,Array[java.lang.String]] = Map(woodchuck -&gt; Array(woodchuck, woodchuck, woodchuck, woodchuck), chuck -&gt; Array(chuck, chuck, chuck), . -&gt; Array(.), would -&gt; Array(would), if -&gt; Array(if, if), a -&gt; Array(a, a, a, a), as -&gt; Array(as, as), , -&gt; Array(,), how -&gt; Array(how), much -&gt; Array(much, much), wood -&gt; Array(wood, wood, wood, wood), ? -&gt; Array(?), could -&gt; Array(could, could, could))
</pre></p>
<p>Now we want to do something much simpler than what we did with the part-of-speech example: we just need to get the length of each Array, since each one contains every token of the corresponding word type. The function passed to <strong>mapValues</strong> is thus quite a bit simpler than the ones given in the previous section.</p>
<p><pre class="brush: scala;">
scala&gt; val counts = woodchuck.split(&quot; &quot;).groupBy(x=&gt;x).mapValues(x=&gt;x.length)
counts: scala.collection.immutable.Map[java.lang.String,Int] = Map(woodchuck -&gt; 4, chuck -&gt; 3, . -&gt; 1, would -&gt; 1, if -&gt; 2, a -&gt; 4, as -&gt; 2, , -&gt; 1, how -&gt; 1, much -&gt; 2, wood -&gt; 4, ? -&gt; 1, could -&gt; 3)
</pre></p>
<p>With <em>counts</em>, we can now access the frequencies of any of the words that were in the text.</p>
<p><pre class="brush: scala;">
scala&gt; counts(&quot;woodchuck&quot;)
res5: Int = 4

scala&gt; counts(&quot;could&quot;)
res6: Int = 3
</pre></p>
<p>Easy! Of course, we normally want to build word counts for texts that are longer and are stored in a file rather than explicitly added to Scala code. The next tutorial will demonstrate how to do that.</p>
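<p>One detail to keep in mind is that asking for the count of a word that never occurred will throw an exception. As mentioned earlier, <strong>getOrElse</strong> with a default of 0 is a natural fit here. The sketch below uses a tiny hand-written Map named <em>miniCounts</em> (a hypothetical stand-in for <em>counts</em>).</p>
<p><pre class="brush: scala;">
// A hand-written stand-in for the counts Map computed above.
val miniCounts = Map((&quot;woodchuck&quot;, 4), (&quot;could&quot;, 3))

println(miniCounts.getOrElse(&quot;woodchuck&quot;, 0))  // 4
println(miniCounts.getOrElse(&quot;marmot&quot;, 0))     // 0: never seen, so count is zero
</pre></p>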
<h2>Iterating over the keys and values in a Map</h2>
<p>The material above shows some useful aspects of Maps, but of course there is much more you can do with them, often requiring iterating through the key-value pairs in the Map. We&#8217;ll use the <em>counts</em> Map created above for demonstrating this.</p>
<p>You can access just the keys, or just the values.</p>
<p><pre class="brush: scala;">
scala&gt; counts.keys
res0: Iterable[java.lang.String] = Set(woodchuck, chuck, ., would, if, a, as, ,, how, much, wood, ?, could)

scala&gt; counts.values
res1: Iterable[Int] = MapLike(4, 3, 1, 1, 2, 4, 2, 1, 1, 2, 4, 1, 3)
</pre></p>
<p>Notice that these are both Iterable data structures, so we can do all of the usual mapping, filtering, and so on, that we have already done with lists. (You may of course convert them to Lists using <strong>toList</strong> if you like.)</p>
<p>You can print out all of the key -&gt; value pairs in the Map in a number of ways. One is to use a for expression.</p>
<p><pre class="brush: scala;">
scala&gt; for ((k,v) &lt;- counts) println(k + &quot; -&gt; &quot; + v)
woodchuck -&gt; 4
chuck -&gt; 3
. -&gt; 1
would -&gt; 1
if -&gt; 2
a -&gt; 4
as -&gt; 2
, -&gt; 1
how -&gt; 1
much -&gt; 2
wood -&gt; 4
? -&gt; 1
could -&gt; 3
</pre></p>
<p>And here are other ways to achieve the same result (output omitted since it is the same).</p>
<p><pre class="brush: scala;">
for (k &lt;- counts.keys) println(k + &quot; -&gt; &quot; + counts(k))
counts.map(kvPair =&gt; kvPair._1 + &quot; -&gt; &quot; + kvPair._2).foreach(println)
counts.keys.map(k =&gt; k + &quot; -&gt; &quot; + counts(k)).foreach(println)
counts.foreach { case(k,v) =&gt; println(k + &quot; -&gt; &quot; + v) }
counts.foreach(kvPair =&gt; println(kvPair._1 + &quot; -&gt; &quot; + kvPair._2))
</pre></p>
<p>And so on. Basically, you are able to step through the Map one key-value pair at a time, or you can grab the set of keys and then step through those and access the values from the map. Which form you use depends on what you need. For example, <strong>foreach</strong> (like a <strong>for</strong> expression without <strong>yield</strong>) doesn&#8217;t return a useful value, but <strong>map</strong> expressions and <strong>for</strong> expressions with <strong>yield</strong> do. Why would you want a value back? Well, as an example, consider grouping all words that have occurred the same number of times.</p>
<p><pre class="brush: scala;">
scala&gt; val countsToWords = counts.keys.toList.map(k =&gt; (counts(k),k)).groupBy(x=&gt;x._1).mapValues(x=&gt;x.map(y=&gt;y._2))
countsToWords: scala.collection.immutable.Map[Int,List[java.lang.String]] = Map(3 -&gt; List(chuck, could), 4 -&gt; List(woodchuck, a, wood), 1 -&gt; List(., would, ,, how, ?), 2 -&gt; List(if, as, much))
</pre></p>
<p>We go from a Map to a Set of its keys to a List of those keys to a List of Tuples of the values and the keys to a Map from the values of the original Map to Lists of such Tuples, and then we map the values of the new Map to just contain the words (the original keys). (That&#8217;s a mouthful, so try each step in the REPL to see what is going on in detail.)</p>
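<p>To make those steps easier to follow, here is a sketch that names each intermediate result. The <em>counts</em> Map is a small stand-in with assumed contents, and the intermediate names (<em>keyList</em>, <em>pairs</em>, <em>grouped</em>) are just labels chosen for illustration. For the final step I use <strong>map</strong> with a <strong>case</strong> pattern rather than <strong>mapValues</strong>; the resulting Map has the same contents.</p>
<p><pre class="brush: scala;">
// Stand-in for the word counts computed earlier; the values are illustrative.
val counts = Map(&quot;woodchuck&quot; -&gt; 4, &quot;chuck&quot; -&gt; 3, &quot;wood&quot; -&gt; 4, &quot;could&quot; -&gt; 3)

// Step 1: the keys as a List of words.
val keyList = counts.keys.toList

// Step 2: pair each word with its count, count first: List[(Int, String)].
val pairs = keyList.map(k =&gt; (counts(k), k))

// Step 3: group the pairs by count: Map[Int, List[(Int, String)]].
val grouped = pairs.groupBy(x =&gt; x._1)

// Step 4: keep only the word from each pair: Map[Int, List[String]].
val countsToWords = grouped.map { case (count, ps) =&gt; (count, ps.map(y =&gt; y._2)) }
</pre></p>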
<p>Now we can output <em>countsToWords</em> sorted in descending numerical order by count, and then by alphabetical order by word within each count.</p>
<p><pre class="brush: scala;">
scala&gt; countsToWords.keys.toList.sorted.reverse.foreach(x =&gt; println(x + &quot;: &quot; + countsToWords(x).sorted.mkString(&quot;,&quot;)))
4: a,wood,woodchuck
3: chuck,could
2: as,if,much
1: ,,.,?,how,would
</pre></p>
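<p>The same output can also be produced by sorting the key-value pairs themselves instead of sorting the keys and looking each one up. This is only a sketch of one alternative: the <em>countsToWords</em> Map below is a stand-in with assumed contents, and <em>lines</em> is a name introduced here for illustration.</p>
<p><pre class="brush: scala;">
// Stand-in for countsToWords from above; the contents are illustrative.
val countsToWords = Map(3 -&gt; List(&quot;chuck&quot;, &quot;could&quot;), 4 -&gt; List(&quot;woodchuck&quot;, &quot;a&quot;, &quot;wood&quot;), 2 -&gt; List(&quot;if&quot;, &quot;as&quot;, &quot;much&quot;))

// Sort the (count, words) pairs by negated count to get descending order,
// and sort each word list alphabetically before joining it with commas.
val lines = countsToWords.toList.sortBy(pair =&gt; -pair._1).map {
  case (count, words) =&gt; count.toString + &quot;: &quot; + words.sorted.mkString(&quot;,&quot;)
}

lines.foreach(println)
// 4: a,wood,woodchuck
// 3: chuck,could
// 2: as,if,much
</pre></p>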
<h2>Options and flatMapping for dealing with missing keys</h2>
<p>I pointed out toward the start of this tutorial that we run into trouble if we ask for a key that doesn&#8217;t exist in a Map. Let&#8217;s go back to the <em>engToDeu</em> Map we began with.</p>
<p><pre class="brush: scala;">
scala&gt; val engToDeu = Map((&quot;dog&quot;,&quot;Hund&quot;), (&quot;cat&quot;,&quot;Katze&quot;), (&quot;rhinoceros&quot;,&quot;Nashorn&quot;))
engToDeu: scala.collection.immutable.Map[java.lang.String,java.lang.String] = Map(dog -&gt; Hund, cat -&gt; Katze, rhinoceros -&gt; Nashorn)

scala&gt; engToDeu(&quot;dog&quot;)
res0: java.lang.String = Hund

scala&gt; engToDeu(&quot;bird&quot;)
java.util.NoSuchElementException: key not found: bird
</pre></p>
<p>There is another way of accessing the elements of a Map, using the <strong>get</strong> method.</p>
<p><pre class="brush: scala;">
scala&gt; engToDeu.get(&quot;dog&quot;)
res2: Option[java.lang.String] = Some(Hund)

scala&gt; engToDeu.get(&quot;bird&quot;)
res3: Option[java.lang.String] = None
</pre></p>
<p>Now, the return value is an <strong>Option[String]</strong>. An <strong>Option</strong> is either a <strong>Some</strong> that contains a value or a <strong>None</strong>, which means there is no value. If you want to get the value out of a Some, you use the <strong>get</strong> method on Options.</p>
<p><pre class="brush: scala;">
scala&gt; val dogTrans = engToDeu.get(&quot;dog&quot;)
dogTrans: Option[java.lang.String] = Some(Hund)

scala&gt; dogTrans.get
res4: java.lang.String = Hund
</pre></p>
<p>If you just use <strong>get</strong> on a Map to obtain an Option and then immediately call <strong>get</strong> on the Option, you get the same behavior you had before: a value when the key is present and an exception when it is not.</p>
<p><pre class="brush: scala;">
scala&gt; engToDeu.get(&quot;dog&quot;).get
res6: java.lang.String = Hund

scala&gt; engToDeu.get(&quot;bird&quot;).get
java.util.NoSuchElementException: None.get
</pre></p>
<p>So, at this point, you are probably thinking that this sounds like a waste of time that is just making things more complex. Wait! It actually is tremendously useful because of pattern matching and the way many methods on sequences work.</p>
<p>First, here is how you can write a protected form of translating the words in a list without getting an exception.</p>
<p><pre class="brush: scala;">
scala&gt; wordsToTranslate.foreach { x =&gt; engToDeu.get(x) match {
|   case Some(y) =&gt; println(x + &quot; -&gt; &quot; + y)
|   case None =&gt;
| }}
dog -&gt; Hund
cat -&gt; Katze
</pre></p>
<p>I know&#8230; this probably still isn&#8217;t convincing &#8212; it still looks more involved than the conditional we used (far) above to check whether <em>engToDeu</em> contained a given key (at least for this particular example). Hold on&#8230; because now we are just about ready for things to get simpler, and learn some useful things about Lists in doing so.</p>
<p>First, you should know about a great method on Lists called <strong>flatten</strong>. If you have a List of Lists of Strings, you can use <strong>flatten</strong> to get a single List of Strings. Consider the following example, in which we flatten a List of Lists of Strings and make a single String out of the result with <strong>mkString</strong>. Notice that the empty List in the third spot of the main List just disappears when we flatten it.</p>
<p><pre class="brush: scala;">
scala&gt; val sentences = List(List(&quot;Here&quot;,&quot;is&quot;,&quot;sentence&quot;,&quot;one&quot;,&quot;.&quot;),List(&quot;The&quot;,&quot;third&quot;,&quot;sentence&quot;,&quot;is&quot;,&quot;empty&quot;,&quot;!&quot;),List(),List(&quot;Lastly&quot;,&quot;,&quot;,&quot;we&quot;,&quot;have&quot;,&quot;a&quot;,&quot;final&quot;,&quot;sentence&quot;,&quot;.&quot;))
sentences: List[List[java.lang.String]] = List(List(Here, is, sentence, one, .), List(The, third, sentence, is, empty, !), List(), List(Lastly, ,, we, have, a, final, sentence, .))

scala&gt; sentences.flatten
res0: List[java.lang.String] = List(Here, is, sentence, one, ., The, third, sentence, is, empty, !, Lastly, ,, we, have, a, final, sentence, .)

scala&gt; sentences.flatten.mkString(&quot; &quot;)
res1: String = Here is sentence one . The third sentence is empty ! Lastly , we have a final sentence .
</pre></p>
<p>Flattening in general is pretty useful in its own right. Where it comes into play with Option values is that Options can be thought of as Lists: Somes are like one-element Lists and Nones are like empty Lists. So, when you have a List of Options, the <strong>flatten</strong> method gives you the value inside each Some, and any Nones just drop away.</p>
<p><pre class="brush: scala;">
scala&gt; wordsToTranslate.map(x =&gt; engToDeu.get(x))
res12: List[Option[java.lang.String]] = List(Some(Hund), None, Some(Katze), None)

scala&gt; wordsToTranslate.map(x =&gt; engToDeu.get(x)).flatten
res13: List[java.lang.String] = List(Hund, Katze)
</pre></p>
<p>Mapping and then flattening is such a generally useful pattern that there is a method, <strong>flatMap</strong>, which does exactly this in one step.</p>
<p><pre class="brush: scala;">
scala&gt; wordsToTranslate.flatMap(x =&gt; engToDeu.get(x))
res14: List[java.lang.String] = List(Hund, Katze)
</pre></p>
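<p>If you want to convince yourself of both points (that an Option behaves like a tiny List, and that <strong>flatMap</strong> is <strong>map</strong> followed by <strong>flatten</strong>), a sketch like the following can be pasted into the REPL. The contents of <em>wordsToTranslate</em> are assumed here, since any list with a mix of known and unknown words will do.</p>
<p><pre class="brush: scala;">
// The dictionary from above, plus an assumed list of words to look up.
val engToDeu = Map(&quot;dog&quot; -&gt; &quot;Hund&quot;, &quot;cat&quot; -&gt; &quot;Katze&quot;, &quot;rhinoceros&quot; -&gt; &quot;Nashorn&quot;)
val wordsToTranslate = List(&quot;dog&quot;, &quot;bird&quot;, &quot;cat&quot;, &quot;snake&quot;)

// A Some converts to a one-element List, and a None to an empty List.
val someAsList = engToDeu.get(&quot;dog&quot;).toList    // List(Hund)
val noneAsList = engToDeu.get(&quot;bird&quot;).toList   // List()

// flatMap gives the same result as map followed by flatten.
val viaFlatten = wordsToTranslate.map(x =&gt; engToDeu.get(x)).flatten
val viaFlatMap = wordsToTranslate.flatMap(x =&gt; engToDeu.get(x))
// both are List(Hund, Katze)
</pre></p>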
<p>So, returning to the translation example above, we can now safely skip on by &#8220;<em>schiffe</em>&#8221; without fuss.</p>
<p><pre class="brush: scala;">
scala&gt; example2.split(&quot; &quot;).flatMap(deuWord =&gt; miniDictionary.get(deuWord)).mkString(&quot; &quot;)
res15: String = from ice liberated are river and
</pre></p>
<p>Whether this is the desired behavior in this particular case is another question (e.g. you really should be doing some special unknown word handling). Nonetheless, you&#8217;ll find that <strong>flatMap</strong> is quite handy in general for this sort of pattern, in which a list of elements is used to retrieve values from a Map that will be missing some of those values.</p>
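<p>As one possible sketch of such unknown word handling, you could keep each missing word in the output and mark it, rather than dropping it. The example uses the <strong>getOrElse</strong> method on Maps, which returns a default when the key is absent. The dictionary, the sentence, and the square-bracket marking are all made up for illustration and are not the ones used earlier in this tutorial.</p>
<p><pre class="brush: scala;">
// Hypothetical toy dictionary and sentence, for illustration only.
val toyDictionary = Map(&quot;und&quot; -&gt; &quot;and&quot;, &quot;eis&quot; -&gt; &quot;ice&quot;, &quot;sind&quot; -&gt; &quot;are&quot;)
val toySentence = &quot;eis und schiffe sind&quot;

// Look each word up; if it is missing, keep the original word in square
// brackets so the reader can see that it was not translated.
val translated = toySentence.split(&quot; &quot;).map(w =&gt; toyDictionary.getOrElse(w, &quot;[&quot; + w + &quot;]&quot;)).mkString(&quot; &quot;)
// translated is &quot;ice and [schiffe] are&quot;
</pre></p>
<p>Unlike the <strong>flatMap</strong> version, this keeps the output the same length as the input, which may or may not be what you want.</p>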
<p>Options and <strong>flatMap</strong> are useful beyond Maps, because you can also write your own functions that return Options and are thus amenable to flatMapping. Consider a function that squares only odd numbers and throws evens away (<em>note</em>: the % operator is the modulo operator, which finds the remainder of dividing one number by another; try it in the REPL).</p>
<p><pre class="brush: scala;">
scala&gt; def squareOddNumber (x: Int) = if (x % 2 != 0) Some(x*x) else None
squareOddNumber: (x: Int)Option[Int]
</pre></p>
<p>If you <strong>map</strong> over the numbers 1 to 10, you&#8217;ll see the Somes and Nones, and if you <strong>flatMap</strong> it, you get exactly the desired result of the squares of all the odd numbers without any pollution from the evens.</p>
<p><pre class="brush: scala;">
scala&gt; (1 to 10).toList.map(x=&gt;squareOddNumber(x))
res16: List[Option[Int]] = List(Some(1), None, Some(9), None, Some(25), None, Some(49), None, Some(81), None)

scala&gt; (1 to 10).toList.flatMap(x=&gt;squareOddNumber(x))
res17: List[Int] = List(1, 9, 25, 49, 81)
</pre></p>
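<p>The same idea applies whenever an operation can fail for some inputs. As a sketch, here is a function that tries to read a String as an Int and returns None when it can&#8217;t. The name <em>parseInt</em> and the sample tokens are made up for this example, and it relies on <strong>try</strong>/<strong>catch</strong>, which this tutorial has not covered.</p>
<p><pre class="brush: scala;">
// Hypothetical helper: parse a String as an Int, giving None on failure.
def parseInt(s: String): Option[Int] =
  try { Some(s.toInt) } catch { case e: NumberFormatException =&gt; None }

// Made-up input in which only some tokens are whole numbers.
val tokens = List(&quot;12&quot;, &quot;woodchuck&quot;, &quot;7&quot;, &quot;3.5&quot;, &quot;40&quot;)

// flatMap keeps the successfully parsed numbers and drops the rest.
val numbers = tokens.flatMap(t =&gt; parseInt(t))
// numbers is List(12, 7, 40)

val total = numbers.sum
// total is 59
</pre></p>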
<p>This turns out to be amazingly useful and common, so much so that the expression &#8220;<a href="http://legendofklang.spreadshirt.se/men-s-legendofklang-flatmap-that-sh-t-A15944995">just flatMap that shit</a>&#8221; has become a common refrain among Scala programmers. <a href="https://gist.github.com/1090775">Scala programmers even write scripts to remind them to do it.</a> :)</p>
<p><span style="color:#888888;">Copyright 2011 Jason Baldridge</span></p>
<p><span style="color:#888888;">The text of this tutorial is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License. Attribution may be provided by linking to www.jasonbaldridge.com and to this original tutorial.</span></p>
<p><span style="color:#888888;">Suggestions, improvements, extensions and bug fixes welcome — please email Jason at jasonbaldridge@gmail.com or provide a comment to this post.</span></p>
]]></content:encoded>
			<wfw:commentRss>https://bcomposes.wordpress.com/2011/09/12/first-steps-in-scala-for-beginning-programmers-part-7/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="https://secure.gravatar.com/avatar/1a5262c6575daf92f62f5cc6d9f1f00f?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jasonbaldridge</media:title>
		</media:content>
	</item>
	</channel>
</rss>
