High-Tech Institute // C++ fundamentals // Module xx (extra exercises)
Kris van Rens
In this exercise, you’ll implement a simple calculator using Reverse Polish Notation (RPN).
You are probably most familiar with algebraic notation, or infix notation, so first read up on RPN to get an idea of what it is and why it’s easy to implement an interpreter for it. RPN takes a bit of getting used to before you really appreciate it.
For example, these expressions in infix notation:
\(5 + 2\)
\(7 * 21 - 9\)
\((220 / 7 + 10) * 4\)
read as follows in RPN:
\(5 \space 2 \space +\)
\(7 \space 21 \space * \space 9 \space -\)
\(220 \space 7 \space / \space 10 \space + \space 4 \space *\)
Each operand is pushed onto the stack; each operator pops the operands it needs off the stack and pushes its result back onto it. For the expressions above, the stack never needs more than two memory cells.
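To get a feel for this, consider evaluating \(7 \space 21 \space * \space 9 \space -\) step by step:
push 7 → the stack holds [7]
push 21 → the stack holds [7, 21]
apply * → pop 21 and 7, push 147 → the stack holds [147]
push 9 → the stack holds [147, 9]
apply - → pop 9 and 147, push 138 → the stack holds [138]
The single remaining value, 138, is the result.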
For the sake of keeping the end result under ±300 lines of C++ code, we’ll keep it simple: the calculator only has to support the operators +, -, *, / and %.
There’s a suite of example tests in the solution code that can be used to validate your program.
The solution code is split up into a separate version for each step, but generally speaking each solution consists of a single .cpp file. Feel free to change this to your liking, or diverge from the exercise in any other way!
The goal is to have fun and learn something!
Feel free to extend your version at will; my solution code is just one of many possibilities.
As a first basic step, implement a program to test your build setup.
It doesn’t have to be functional otherwise.
Use an IDE to create a C++ project, build it manually, or use a build system like CMake.
For example, a manual build using GCC:
$ g++ -Wall -Wextra -Wconversion -Werror -O3 -std=c++20 -o rpn-calculator rpn-calculator.cpp
Feel free to steal the CMake configuration from the solution code.
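If you’d rather set up the project yourself, a minimal CMakeLists.txt could look roughly like this (the project and target names are just placeholders):

cmake_minimum_required(VERSION 3.20)
project(rpn-calculator LANGUAGES CXX)

add_executable(rpn-calculator rpn-calculator.cpp)

target_compile_features(rpn-calculator PRIVATE cxx_std_20)
# Warning flags as used in the manual GCC build above (GCC/Clang syntax).
target_compile_options(rpn-calculator PRIVATE -Wall -Wextra -Wconversion -Werror)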
Add the code to read a string of user input from the command-line terminal.
Ideally, your program can accept any data from “standard input”. This can be used for testing, for example using echo or a here-string:
echo '5 2 *' | ./rpn-calculator
./rpn-calculator <<< '5 2 *'
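A minimal sketch of what reading a line from standard input might look like, assuming the whole expression arrives on a single line:

#include <iostream>
#include <string>

int main() {
  std::string input;
  if (!std::getline(std::cin, input)) {
    return 1; // No input available.
  }

  std::cout << "Read input: " << input << '\n';
}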
Add the code to categorize the input string. Classify each piece of input data as either:
An operand (i.e. a signed integer number),
An operator (i.e. one of the supported operation symbols),
Invalid data.
The next step in input classification is tokenization: the process of wrapping the classified data into distinct C++ types. With this step we basically “plug into” the C++ type system and benefit from it.
When modeling the data type(s) for a token, consider the number of states it can have and how many of those are valid. Try to reach an optimal mapping by minimizing the number of possible invalid states.
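One possible starting point, and the shape we will revisit later in this exercise, is a plain struct with an enum class as discriminant and the raw text as its data field (the names below are only suggestions):

#include <string>

enum class TokenType { Operand, Operator, Invalid };

struct Token {
  TokenType type{TokenType::Invalid};
  std::string value; // The raw input text, e.g. "42" or "+".
};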
Add the code to parse the string data from the token into a value that can be used in calculations.
Note: prefer the use of C++17’s std::from_chars over the C-style atoi/atol/… functions.
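As an indication only, parsing with std::from_chars could look something like this, using std::optional to signal failure (the function name is just an example):

#include <charconv>
#include <optional>
#include <string_view>

std::optional<long> parse_operand(std::string_view text) {
  long value{};
  const auto [ptr, ec] = std::from_chars(text.data(), text.data() + text.size(), value);

  // Reject partial matches and out-of-range values alike.
  if (ec != std::errc{} || ptr != text.data() + text.size()) {
    return std::nullopt;
  }

  return value;
}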
Add the code to perform a single calculation using the parsed and tokenized user input.
Hint: a switch statement can be used over integral and enumeration types, or any type contextually implicitly convertible to an integral or enumeration type.
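For example, a calculation function built around such a switch might look like this; the Operator enumerator is an assumption, and error cases like division by zero are ignored for now:

enum class Operator { Plus, Minus, Multiply, Divide, Modulo };

long calculate(long lhs, long rhs, Operator op) {
  switch (op) {
  case Operator::Plus:     return lhs + rhs;
  case Operator::Minus:    return lhs - rhs;
  case Operator::Multiply: return lhs * rhs;
  case Operator::Divide:   return lhs / rhs;
  case Operator::Modulo:   return lhs % rhs;
  }

  return 0; // Unreachable for valid enumerator values.
}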
Add the code for proper stack-based memory to store operands and results.
Note: take a look at available container types in the C++ standard library.
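As a hint: std::stack from <stack> offers push(), top() and pop(), which map nicely onto RPN evaluation. A tiny illustration:

#include <stack>

int main() {
  std::stack<long> operands;

  operands.push(5);
  operands.push(2);

  const long rhs = operands.top(); operands.pop();
  const long lhs = operands.top(); operands.pop();

  operands.push(lhs + rhs); // The result of '5 2 +' is back on the stack.
}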
In order to accept each operand separately, or accept multiple calculation requests in interactive mode, we need some kind of state machine. Set up the state machine “happy path” first.
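A happy-path skeleton could be built around an enumerator and a loop over the input tokens; the state names below are just one possibility:

#include <iostream>
#include <string>

enum class State { ExpectOperandLhs, ExpectOperandRhs, ExpectOperator };

int main() {
  auto state = State::ExpectOperandLhs;
  std::string token;

  while (std::cin >> token) {
    switch (state) {
    case State::ExpectOperandLhs: /* ..store lhs..               */ state = State::ExpectOperandRhs; break;
    case State::ExpectOperandRhs: /* ..store rhs..               */ state = State::ExpectOperator;   break;
    case State::ExpectOperator:   /* ..calculate, print result.. */ state = State::ExpectOperandLhs; break;
    }
  }
}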
Add the code to deal with each erroneous input in a graceful way.
You may have noticed that, broadly speaking, there are two classes of errors:
Fatal errors resulting from memory issues or other internal exceptions (log + stop the program),
Calculation errors resulting from user input mistakes (log + reset state machine).
Add the code to deal with the two classes of errors separately and correctly.
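One way to keep the two separated is to let internal problems propagate as exceptions up to main, while handling user-input mistakes close to where they occur; a rough sketch (the CalculationError type is an assumption):

#include <exception>
#include <iostream>
#include <stdexcept>

// Thrown for recoverable user-input mistakes (e.g. an invalid token).
struct CalculationError : std::runtime_error {
  using std::runtime_error::runtime_error;
};

int main() {
  try {
    // ..run the calculator loop; catch CalculationError inside it, log, and reset the state machine..
  } catch (const std::exception& e) {
    std::cerr << "Fatal error: " << e.what() << '\n'; // Fatal: log and stop the program.
    return 1;
  }
}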
Let’s get back to the earlier observation that the struct Token, containing an enum class as a discriminant and a std::string value as its data field, is not ideal. For example, suppose the input token is Invalid: we can still call parse, which then fails. This data structure is simply a bad mapping onto the problem space.
We need a “true enumerator”, or “discriminated union”. In C++ we can use std::variant for that.
Each token type will be represented by its own C++ type. This hugely improves the data structure mapping.
Rewrite the Token type to be a std::variant. Each state can keep its own state-specific data, and can have its own state-specific behavior.
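A possible shape for such a variant-based Token; the type names and data members are freely chosen here:

#include <string>
#include <variant>

namespace Tokens {

struct Operand {
  long value{};
};

struct Operator {
  char symbol{}; // One of '+', '-', '*', '/' or '%'.
};

struct Invalid {
  std::string text; // The offending input, useful for error messages.
};

} // namespace Tokens

using Token = std::variant<Tokens::Operand, Tokens::Operator, Tokens::Invalid>;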
std::variant
In a std::variant, the states are “visited” using std::visit. This is a bit of a clunky method.
Use the following setup to visit states using lambdas:
#include <variant>

namespace States {
struct A {};
struct B {};
struct C {};
}

using State = std::variant<States::A, States::B, States::C>;

template<typename... Ts>
struct overload : Ts... {
  using Ts::operator()...;
};
State s = States::A{};

std::visit(overload{
    [](States::A) {
      // ..do state 'A' stuff..
    },
    [](States::B) {
      // ..do state 'B' stuff..
    },
    [](States::C) {
      // ..do state 'C' stuff..
    },
  }, s);
It uses the overload class template to choose the right lambda function call.
In order to support longer calculations, like \(7 \space 21 \space * \space 9 \space -\), we must adapt the state machine flow.
The result of each calculation is stored on the stack, as if it were the first operand. We need an alternative way to detect the end of a calculation, which will influence the behavior of the calculator, especially in interactive mode.
Similar to how we changed the Token type into a std::variant, we can change the States enumerator.
When adding state-specific data or behavior later, this can be done cleanly.
Find situations where calculations fail, and add code to remediate them or fail gracefully.
For example, what happens if we enter:
\(1 \space 0 \space /\)
\(999999999999999999999999999 \space 1 \space +\)
\(999999999999999999999 \space 999999999999999999999 \space *\)
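Division or modulo by zero is undefined behavior for integers, so it has to be caught before performing the operation; the huge literals simply do not fit in a long, which std::from_chars reports via std::errc::result_out_of_range. As an indication, a checked version of the hypothetical calculate from before could look like this:

#include <optional>

enum class Operator { Plus, Minus, Multiply, Divide, Modulo };

// Returns std::nullopt for a division or modulo by zero.
std::optional<long> checked_calculate(long lhs, long rhs, Operator op) {
  switch (op) {
  case Operator::Plus:     return lhs + rhs;
  case Operator::Minus:    return lhs - rhs;
  case Operator::Multiply: return lhs * rhs; // Note: signed overflow is not caught here.
  case Operator::Divide:
    if (rhs == 0) { return std::nullopt; }
    return lhs / rhs;
  case Operator::Modulo:
    if (rhs == 0) { return std::nullopt; }
    return lhs % rhs;
  }

  return std::nullopt;
}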
For future improvements / enhancements, we’d like to make the calculate function type-independent.
Add the code and checks to make calculate generic over data types.
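One way to sketch this is to constrain the template to arithmetic types and handle the integer-only % operator separately (the Operator enumerator is still an assumption):

#include <cmath>
#include <type_traits>

enum class Operator { Plus, Minus, Multiply, Divide, Modulo };

template<typename T>
  requires std::is_arithmetic_v<T>
T calculate(T lhs, T rhs, Operator op) {
  switch (op) {
  case Operator::Plus:     return lhs + rhs;
  case Operator::Minus:    return lhs - rhs;
  case Operator::Multiply: return lhs * rhs;
  case Operator::Divide:   return lhs / rhs;
  case Operator::Modulo:
    if constexpr (std::is_integral_v<T>) {
      return lhs % rhs;
    } else {
      return std::fmod(lhs, rhs); // The built-in % is not defined for floating-point types.
    }
  }

  return T{};
}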
Similar to calculate, we want to make Token::Operand::parse type-independent.
Add the code and potential checks to make Token::Operand::parse generic over data types.
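A sketch of what a generic parse might look like; note that std::from_chars has overloads for both integral and floating-point types, although floating-point support arrived relatively late in some standard library implementations:

#include <charconv>
#include <optional>
#include <string_view>
#include <type_traits>

template<typename T>
  requires std::is_arithmetic_v<T>
std::optional<T> parse(std::string_view text) {
  T value{};
  const auto [ptr, ec] = std::from_chars(text.data(), text.data() + text.size(), value);

  if (ec != std::errc{} || ptr != text.data() + text.size()) {
    return std::nullopt; // Invalid or out-of-range input.
  }

  return value;
}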
Up until now we’ve used the std::stack type from the C++ standard library. In the C++ course, we wrote our own Stack class template type, and even created a partial specialization for a maximum size. This custom stack may well be much more resource-efficient than std::stack.
Add the custom Stack implementation, specialized for a maximum stack size of 2.
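The course’s own Stack class template is the reference here, but as an indication of the idea, a fixed-capacity stack without any dynamic allocation could be shaped roughly like this (simplified compared to the partial-specialization approach, with a freely chosen interface):

#include <array>
#include <cstddef>

template<typename T, std::size_t MaxSize>
class Stack {
public:
  bool push(const T& value) {
    if (size_ == MaxSize) { return false; } // Stack is full.
    data_[size_++] = value;
    return true;
  }

  bool pop(T& value) {
    if (size_ == 0) { return false; } // Stack is empty.
    value = data_[--size_];
    return true;
  }

  [[nodiscard]] std::size_t size() const { return size_; }

private:
  std::array<T, MaxSize> data_{};
  std::size_t size_{};
};

// For the RPN calculator, a maximum size of 2 suffices: Stack<long, 2> operands;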
All the functions in our code base so far should be unit-tested. This can be done using a test framework like Doctest, Google Test or Catch2; unfortunately, C++ itself does not offer many facilities for this.
Add code to unit-test the functions (calculate, Token::Operand::parse, read_token) in the program.
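As an impression, a Doctest-based test for calculate could look like this; the exact signatures depend on your own code:

#define DOCTEST_CONFIG_IMPLEMENT_WITH_MAIN
#include <doctest/doctest.h>

// Assuming the declarations of 'calculate' and 'Operator' are visible here.

TEST_CASE("calculate performs basic arithmetic") {
  CHECK(calculate(5, 2, Operator::Plus) == 7);
  CHECK(calculate(147, 9, Operator::Minus) == 138);
  CHECK(calculate(5, 2, Operator::Modulo) == 1);
}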
Now that Token::Operand::parse and calculate are type-independent, we can switch our base data type.
Switch the base data type from long to float.
What happens to the parsing code? Is the rest of the program still valid?