
C# Tutoring Subject: SetLoader

Tutoring (soutien) subject for the 2021-03-25 session in C#

C# Soutien/Tutoring TP – SetLoader

This mini-subject gives you an example of how, and why, to use some of the features we showed you.

In this subject, we will go over:

  • File I/O
  • Functions as parameters
  • Genericity
  • Sets

The goal of this subject is to implement a program that parses .set files and fills a HashSet with their contents.

Skeleton

Here is the basic scaffolding for this TP:

public static class SetParser
{
    // ========================================
    //                STAGE 1
    // ========================================

    public static List<string> LoadAndFilterLines(string path)
    {
        // FIXME
        throw new NotImplementedException();
    }

    // ========================================
    //                STAGE 2
    // ========================================

    public static List<string> LoadAndExpand(string filePath)
    {
        // FIXME
        throw new NotImplementedException();
    }

    // ========================================
    //                STAGE 3
    // ========================================

    public static HashSet<string> LinesToSet(List<string> lines)
    {
        // FIXME
        throw new NotImplementedException();
    }

    public static HashSet<string> LoadStringSet(string filePath)
    {
        // FIXME
        throw new NotImplementedException();
    }

    // ========================================
    //                STAGE 4
    // ========================================

    public static HashSet<T> LinesToSet<T>(List<string> lines, Func<string, T> transformer)
    {
        // FIXME
        throw new NotImplementedException();
    }

    public static HashSet<T> LoadSet<T>(string filePath)
    {
        // FIXME
        throw new NotImplementedException();
    }
}

.set format

This subject uses a fictional file type called the dot-set format.

The dot-set format is used to represent sets. The format is:

  • All elements are made of non-special ASCII characters (A-Z, a-z, 0-9, all basic punctuation)
  • One element per line. Line breaks can be either \r\n (Windows format) or \n. A parser must handle both.
    • Empty lines must be ignored
    • Lines that immediately start with a # must be ignored: they represent comments, except in the following case.
      • Lines that start with #= represent a file import: those must not be ignored.
    • Space and tab characters at the beginning of a line must be ignored, e.g. \t \t hello must be considered as just hello.
  • Elements can be turned into any value. The transformation process is up to the implementation.
    • Duplicated values (i.e. elements that lead to the same value) must be ignored (i.e. only add them once).
    • If an element starts with #=, it must be processed specially. Read the file whose path follows the #=, resolved relative to the location of the current file, and append all of its values to the set we are currently constructing. If a file is designated twice, it must be imported only once.

Here is an example:

Matthieu
Paul
Thomas

# I am a comment

Zoroark
    Cobaltarrena
Qwarks

Zoroark
Cobaltarrena
Qwarks

In the end, we should get a set containing six values: Matthieu, Paul, Thomas, Zoroark, Cobaltarrena and Qwarks.

Stage 1: Lines Reader + Filter

This function will be responsible for reading our file and excluding lines we’re not interested in.

Implement a function that returns a list of lines:

  • Reads each line from the file at path. You must handle both \r\n and \n line endings.
  • If the line is empty or starts with a #, unless it starts with #=, go to the next line.
  • Remove spaces and tabs at the beginning of the line.
  • Add the line (without the line ending character(s)) to the end of the list.

public static List<string> LoadAndFilterLines(string path);
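
As a rough illustration of the rules listed above, here is a minimal sketch. It assumes that File.ReadAllLines is an acceptable way to read the file (it splits on both \r\n and \n and drops the line endings), and it applies the checks in the same order as the list: skip first, trim afterwards. FilterLines and the class name are hypothetical, introduced only so the filtering can be tried on in-memory lines; they are not part of the skeleton.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class SetParserStage1Sketch
{
    // Hypothetical helper (not in the skeleton): applies the filtering
    // rules to lines that have already been split.
    public static List<string> FilterLines(IEnumerable<string> rawLines)
    {
        var result = new List<string>();
        foreach (string line in rawLines)
        {
            // Skip empty lines.
            if (line.Length == 0)
                continue;

            // Skip comments, but keep import lines that start with "#=".
            if (line.StartsWith("#", StringComparison.Ordinal)
                && !line.StartsWith("#=", StringComparison.Ordinal))
                continue;

            // Remove leading spaces and tabs only, then keep the line.
            result.Add(line.TrimStart(' ', '\t'));
        }
        return result;
    }

    public static List<string> LoadAndFilterLines(string path)
    {
        // File.ReadAllLines returns the lines without their terminators.
        return FilterLines(File.ReadAllLines(path));
    }
}
```

Calling FilterLines on the lines Matthieu, an empty line, # I am a comment, #=other.set and \t \t hello should keep Matthieu, #=other.set and hello, in that order.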

Stage 2: Expanding

The Problem

We now need to implement an expansion mechanism. The idea behind it is to replace every line that starts with #= with the values contained in the file named next to it. For example:

# file1.set
One
Two
#=file2.set
Four

# file2.set
Three

Our end result should contain, in that order, One, Two, Three and Four.

However, it is possible to have cyclic dependencies, such as:

# file1.set
One
Two
#=file2.set

# file2.set
Three
#=file3.set

# file3.set
Four
#=file1.set

Fortunately, the format tells us that any file must be imported at most once, so this is not really a problem. The end result here would be One, Two, Three and Four.

We do have to be careful about a specific case though, and that’s the following:

# file1.set
One
#=file2.set
#=./file2.set

# file2.set
Two

Although file2.set and ./file2.set are different strings, they represent the same file, and the end result must be One, Two.

Note that paths are also resolved relative to where we currently are. In the examples above, this just means that we will look for ./file2.set in the directory where file1.set is.

The Solution

If we think about the problem at hand, our function could be summed up like this:

load_file(f: real path of the file, already_loaded: list) => list:
  if f is in already_loaded => return empty list
  add f to already_loaded
  load lines of f
  create an output list
  for each line of f:
    if the line represents an import:
      compute the real path of the line compared to f
      add everything contained in load_file(the file to import) to the output list
    else:
      add the line to the output list
  return the output list

Implement this algorithm in C# in the following function:

public static List<string> LoadAndExpand(string filePath);

You can use the function you previously wrote.
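
One possible shape for this recursion is sketched below. It is not the reference solution. Two things here are assumptions: the loadLines parameter is a hypothetical stand-in for LoadAndFilterLines, so the sketch can be tried without touching the disk, and already_loaded is stored in a HashSet rather than a list, which is a substitution made for convenience.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ExpandSketch
{
    // Entry point: turns the path into a real path and starts with an
    // empty "already loaded" collection.
    public static List<string> Expand(string filePath, Func<string, List<string>> loadLines)
    {
        return Expand(Path.GetFullPath(filePath), new HashSet<string>(), loadLines);
    }

    private static List<string> Expand(
        string realPath,
        HashSet<string> alreadyLoaded,
        Func<string, List<string>> loadLines)
    {
        var output = new List<string>();

        // HashSet.Add returns false when the path was already present,
        // which means this file has been imported before.
        if (!alreadyLoaded.Add(realPath))
            return output;

        string directory = Path.GetDirectoryName(realPath) ?? "";
        foreach (string line in loadLines(realPath))
        {
            if (line.StartsWith("#=", StringComparison.Ordinal))
            {
                // Resolve the imported path relative to the current file.
                string imported = Path.GetFullPath(Path.Combine(directory, line.Substring(2)));
                output.AddRange(Expand(imported, alreadyLoaded, loadLines));
            }
            else
            {
                output.Add(line);
            }
        }
        return output;
    }
}
```

In the real function, loadLines would simply be LoadAndFilterLines. Passing it as a parameter is only a convenience for experimenting with in-memory data.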

For computing real paths, you will need:
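
One possible combination, assuming the standard System.IO.Path helpers (this choice of methods is an assumption, not necessarily the one used during the session): Path.GetFullPath to obtain an absolute, normalized path, Path.GetDirectoryName to find the directory of the current file, and Path.Combine to join that directory with the imported path. ResolveImport and the class name are hypothetical.

```csharp
using System;
using System.IO;

public static class PathSketch
{
    // Hypothetical helper: resolves the path written after "#=" relative
    // to the directory that contains the importing file.
    public static string ResolveImport(string currentFile, string importedPath)
    {
        string currentFull = Path.GetFullPath(currentFile);
        string directory = Path.GetDirectoryName(currentFull) ?? "";

        // GetFullPath also removes "." and ".." segments, so "file2.set"
        // and "./file2.set" end up as the same string.
        return Path.GetFullPath(Path.Combine(directory, importedPath));
    }
}
```

With this sketch, ResolveImport("sets/file1.set", "file2.set") and ResolveImport("sets/file1.set", "./file2.set") return the same string, which is what makes the "already loaded" check reliable. It does not account for symbolic links or for case-insensitive file systems.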

Stage 3: Transformation

Now that we are able to get a clean list of lines, it’s time to construct our set. The case of strings is the easiest one by far. Simply consider that the transformation is just leaving the string intact.

public static HashSet<string> LinesToSet(List<string> lines);
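
A minimal sketch of this step, relying on the fact that HashSet<T>.Add leaves the set unchanged when the value is already present. The class name is hypothetical.

```csharp
using System.Collections.Generic;

public static class StringSetSketch
{
    public static HashSet<string> LinesToSet(List<string> lines)
    {
        var set = new HashSet<string>();
        foreach (string line in lines)
        {
            // Add returns false for a duplicate and does not insert it,
            // so each value ends up in the set only once.
            set.Add(line);
        }
        return set;
    }
}
```

Passing the lines directly to the HashSet<string> constructor, as in new HashSet<string>(lines), would have the same effect. The explicit loop is kept here because it carries over directly to the generic version.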

Let’s also make the final function that takes care of loading and expanding our files, then turns them into a set of strings:

public static HashSet<string> LoadStringSet(string filePath);

Stage 4: Generic Transformation

Let’s now make a generic version. We want to be able to handle anything, but we do not necessarily know how to create “something” from a string, since that “something” can be anything.

The solution here is to add a transformation function as a parameter of our functions, like so:

public static HashSet<T> LinesToSet<T>(List<string> lines, Func<string, T> transformer);
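
To make the idea of a function parameter concrete, here is a small sketch in which the transformer is applied to each line before it is added to the set. The class name is hypothetical, and parsing integers is just one example of a transformation.

```csharp
using System;
using System.Collections.Generic;

public static class GenericSetSketch
{
    public static HashSet<T> LinesToSet<T>(List<string> lines, Func<string, T> transformer)
    {
        var set = new HashSet<T>();
        foreach (string line in lines)
        {
            // The caller decides how a string becomes a T.
            // Lines that lead to the same value are only stored once.
            set.Add(transformer(line));
        }
        return set;
    }
}
```

For example, GenericSetSketch.LinesToSet(new List<string> { "1", "01", "2" }, s => int.Parse(s)) builds a HashSet<int> with two values, because "1" and "01" both lead to the value 1.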

Then make the final function:

public static HashSet<T> LoadSet<T>(string filePath);