2019-10-30
**********

What is Sheerka?
""""""""""""""""

Sheerka is a *communication* language,
as opposed to traditional *programming* languages. Its
purpose is to ease communication between the (wo)man and the machine,
ultimately using voice. I will first use it to program faster, and maybe
more easily.

.. _ulysse31: https://fr.wikipedia.org/wiki/Ulysse_31

Where does the name Sheerka come from?
""""""""""""""""""""""""""""""""""""""
Sheerka is my misspelling of Shyrka, from my childhood anime ulysse31_.
For those who don't know this old cartoon, it's Homer's Odyssey
transposed to the 31st century. Ulysses has a spacecraft with an AI named Shyrka.

I was a great fan of this cartoon when I was young. I thought that the idea of
bringing the ancient story of Ulysses into the future was brilliant.

Ever since then, Sheerka has been my reference for any sophisticated computer. Unfortunately
for me, at that time there was no Wikipedia to tell me the correct spelling.
|
|
|
|
Model v0
""""""""
In my view, everything begins with **Events**. Basically, they are the commands (i.e. requests)
entered by the users.

The events are parsed to understand what is required, and they produce a new **State**.
The state is like a big dictionary that holds everything that is known by the system.

Most of the elements saved in the **State** are **Concepts**. In this first version,
it's a little bit complicated to define what a **Concept** is, as it can have several
usages. To keep it simple, I will say that a **Concept** is an idea that can be
manipulated by the rest of the system.
I am pretty sure that its form and usage will evolve as I manipulate
them.

- Each **State** has a reference to the event(s) that triggered this state
- Each **State** has a **history**
- Each **Concept** has a **history**


A **history** is a triplet of

- user name
- modification date
- digest of the parent

.. _git: https://git-scm.com/

Personally, I have taken this way of tracking modifications from how it's done in git_;
I guess Linus Torvalds took it from somewhere.
|
|
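The model above can be sketched with a few Python dataclasses. This is only an illustration under my own assumptions: the class and field names (:code:`History`, :code:`parent_digest`, :code:`event_digests`, :code:`content`) are hypothetical and are not taken from the actual Sheerka code.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class History:
    """The triplet: user name, modification date, digest of the parent."""
    user_name: str
    modification_date: datetime
    parent_digest: Optional[str]  # None for the very first version


@dataclass
class Concept:
    name: str
    history: History


@dataclass
class State:
    # Digests of the event(s) that triggered this state (hypothetical field)
    event_digests: List[str]
    history: History
    # The "big dictionary" holding everything known by the system
    content: Dict[str, Any] = field(default_factory=dict)


first = History("kodjo", datetime(2019, 10, 30), None)
state = State(event_digests=["e0"], history=first)
state.content["hello"] = Concept("hello", first)
```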
|
|
|
|
2019-10-31
**********

More on Concepts
""""""""""""""""
To define a new concept

::

    def concept hello a as "hello" + a


Note that the traditional quotes that would surround 'hello' and 'a' are not necessary.
In this example 'a' is a variable, as it appears as a variable in the 'as' section (while hello
appears as a string).

So, you could call the concept with

::

    hello kodjo
    hello my friend

They will produce the strings "hello kodjo" and "hello my friend".
|
|
|
|
About versioning
""""""""""""""""
As I said previously, I mimic how git_ versions its objects.

::

    Obj v0: parents = []
            user name = <a name>
            modification date = <a date>
            digest = xxxxx

    Obj v1: parents = [xxxxx]
            user name = <a name>
            modification date = <a date>
            digest = yyyyy

    Obj v2: parents = [yyyyy]
            user name = <a name>
            modification date = <a date>
            digest = zzzzz

and so on...

I always keep a reference to the last version of the object, so I can navigate through
the versions using the :code:`parents` attribute of the object.

In git_, there are basically two types of objects:

- **content** (file content, or directory structure)
- **reference** to content (commits or tags)

The hash of a **content** depends only on the content itself, while the hash of a **reference** also depends
on the user name, the modification date and the parents. In both cases, the hash is
computed on the whole object, so the hash can also be used to check the integrity
of an object.

For my objects, I need to decide how I compute the hash.

**Concepts** have a history. If I decide to include the history in the hash,
then since the modification date is :code:`datetime.now()`, a new version will be created
even if the **Concept** has not changed. If I don't include it, the integrity of
what is saved is no longer guaranteed.

I choose to value identity over integrity. The hash code of a **Concept** does not depend
on its history. We will see what the future will say about this.
|
|
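To make the trade-off concrete, here is a minimal sketch of a digest that ignores the history. The function names (:code:`concept_digest`, :code:`save`), the dictionary layout and the choice of SHA-256 over a JSON dump are all assumptions made for the example, not how Sheerka actually does it.

```python
import hashlib
import json
from datetime import datetime


def concept_digest(content):
    """Hash the content only; the history is deliberately left out."""
    payload = json.dumps(content, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def save(previous, content, user_name):
    """Return a new version, or the previous one if the content is unchanged."""
    digest = concept_digest(content)
    if previous is not None and previous["digest"] == digest:
        return previous  # identical content: no new version is created
    return {
        "parents": [previous["digest"]] if previous else [],
        "user name": user_name,
        "modification date": datetime.now().isoformat(),
        "digest": digest,
        "content": content,
    }


v0 = save(None, {"name": "hello", "body": '"hello" + a'}, "kodjo")
same = save(v0, {"name": "hello", "body": '"hello" + a'}, "kodjo")
v1 = save(v0, {"name": "hello", "body": '"hi" + a'}, "kodjo")
```

Here :code:`same` is the very object :code:`v0`, because the date never enters the digest. The cost is the one stated above: a tampered user name or date would not be detected by recomputing the digest.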
|
|
2019-11-01
**********

Inspired by CodinGame
"""""""""""""""""""""


.. _codingame: https://www.codingame.com/home

I am trying to teach my kid how to code. He is 12 years old and it was his very
first time.

Rather than trying a standard formal approach, we went to the codingame_ web site. There
are some pros and cons to using this platform, especially for complete beginners, but
I like the visual output of the programs. It's really like coding a game!

What I hadn't noticed previously is that (at least for the first programs) the solution
is given in human language.

For example, for the exercise called "The descent" you will find

::

    For each round of play :
        Reset the variables containing the index of the highest mountain and its height to 0
        For each mountain index (from 0 to 7 included) :
            Read the height of the mountain (variable 'mountainH') from stdin
            If it's higher than the highest known mountain, save its index and height
        Returns the index of the highest mountain on stdout

It would be great if Sheerka were able to produce some code from these instructions :-)
|
|
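For comparison, this is a hand-written Python translation of one round of those instructions; it is not something Sheerka produces today. To keep it testable, I assume the eight heights are passed as a list instead of being read from stdin, and the index is returned instead of being printed. The function name :code:`highest_mountain` is mine.

```python
def highest_mountain(heights):
    """Return the index of the highest mountain for one round of play."""
    # Reset the index of the highest mountain and its height to 0
    highest_index = 0
    highest_height = 0
    # For each mountain index (from 0 to 7 included)
    for index, mountain_h in enumerate(heights):
        # If it's higher than the highest known mountain, save its index and height
        if mountain_h > highest_height:
            highest_index = index
            highest_height = mountain_h
    return highest_index


target = highest_mountain([9, 8, 7, 3, 2, 1, 0, 5])
```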
|
|
Some words on data persistence
""""""""""""""""""""""""""""""
As I previously said (or not), the main difference between Sheerka and other languages
is that Sheerka has a memory of its (her? :-) previous interactions with the users.

The **Concepts**, as well as the **Events** and the **Rules**, are persisted. Because of
that, I think that the more Sheerka is used, the easier it will be to use.

So my first focus was to decide which database to use.

There are tons of different databases already on the market. Unfortunately for me, I'm not
a database expert. But I already knew that I was not looking for a traditional
relational database (RDBMS), as the structure will evolve and I didn't want to spend
my time redesigning the schemas and the constraints.

As I was learning Python, it could have been a good idea to also start looking at an
existing NoSQL database. I started to look at MongoDB, but I got lazy. I knew that
the top feature that I needed was the management of the history (the way git does it),
and it was not provided by Mongo, or I didn't notice it in my first readings on the subject.

So I decided to design and implement my own database.
|
|
|
|
|
|
SheerkaDataProvider (sdp)
"""""""""""""""""""""""""
Not a great name, I confess. But who cares?

What are the main design constraints?

::

    1. No coupling to the filesystem.
       We must not care about where the data is stored.
       The first implementation will be file based, but it has to be extensible.
       The final target is a decentralized persistence system.
    2. CRUD operations are designed according to my needs.
       I don't want standard CRUD operations that I will have to tweak.
       The direct consequence is that sdp won't fit any other purpose.
    3. History management for State and other objects, for free.


sdp, like many modern database systems, is a dictionary: a big list of key-value pairs.
The key is a string, the value can be almost anything. Actually, for my needs, I guess
that I only need strings, numbers and lists (of strings and numbers :-)

JSON also provides true, false and null. So I guess that I will also need them.

I need at least one level of categorization. That means that my objects can be grouped.
The basic signature to add a new element is :code:`add(entry, obj)`,

with

::

    entry : the group / category where I want to put the object
    obj   : the object to persist

With :code:`add("All_Concepts", "foo")` the database (let's call it **State** once and for all) will be updated like this:

.. code-block:: json

    {"All_Concepts" : "foo"}

If I want to add another entry, I don't want to care about what was previously done. I
need the second call :code:`add("All_Concepts", "bar")` to produce

.. code-block:: json

    {"All_Concepts" : ["foo", "bar"]}


So we are no longer in the usual way of implementing CRUD.
|
|
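A minimal in-memory sketch of this behaviour could look like the following. Passing the state explicitly as a plain dictionary is an assumption made to keep the example self-contained; the real sdp signature is :code:`add(entry, obj)`.

```python
def add(state, entry, obj):
    """Add obj under entry without overriding what is already there."""
    if entry not in state:
        # First object of the group: stored as is
        state[entry] = obj
    elif isinstance(state[entry], list):
        # The group already holds several objects: append
        state[entry].append(obj)
    else:
        # Second object of the group: promote the single value to a list
        state[entry] = [state[entry], obj]
    return state


state = {}
add(state, "All_Concepts", "foo")
first = dict(state)  # snapshot after the first call
add(state, "All_Concepts", "bar")
```

One limit of this sketch: if the stored object is itself a list, it cannot be told apart from a group of several objects.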
|
|
|
|
|
|
2019-11-06
**********

Input processing
""""""""""""""""
The basic processing flow should be

::

    1. parsers
    2. evaluators
    3. printers

So, for each new input, all known parsers will try to recognize the input. Each parser will
return a triplet of :code:`(status, concept found (or node found), text message)`.

This list of triplets is given to the evaluators. In the same way, there should be multiple
types of evaluators. There will also be the rules, which will be introduced later.

All evaluators will provide a list (I guess it will be triplets as well) to the printers.
|
|
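The flow can be sketched as below. Everything here is hypothetical: the two toy parsers, the :code:`keep_successes` evaluator and the :code:`process` function are invented for the illustration, and chaining the evaluators one after the other is only one possible reading of the description above.

```python
from typing import Any, List, Tuple

# (status, concept or node found, text message)
Result = Tuple[bool, Any, str]


def integer_parser(text: str) -> Result:
    try:
        return (True, int(text), "integer recognized")
    except ValueError:
        return (False, None, "not an integer")


def word_parser(text: str) -> Result:
    if text.isalpha():
        return (True, text, "word recognized")
    return (False, None, "not a single word")


def keep_successes(results: List[Result]) -> List[Result]:
    """A trivial evaluator: drop what the parsers failed to recognize."""
    return [result for result in results if result[0]]


def process(text, parsers, evaluators, printer) -> List[Result]:
    # 1. every known parser tries to recognize the input
    results = [parser(text) for parser in parsers]
    # 2. each evaluator receives a list of triplets and returns a list of triplets
    for evaluator in evaluators:
        results = evaluator(results)
    # 3. the final list goes to the printer
    printer(results)
    return results


output = process("42", [integer_parser, word_parser], [keep_successes], print)
```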
|
|
Python processing
"""""""""""""""""
Sheerka natively understands Python, so it will be able to execute Python code.
I will deal later with the issues caused by the different versions of Python, or the fact
that some external modules must remain isolated (maybe using virtualenv).

My first problem is to correctly implement the :code:`eval / exec` function.

I don't know why, but Python has two similar functions to do the same thing. One must use
eval to evaluate an expression, or exec to execute code. There must be an explanation but,
as of now, it seems to be a complication for nothing.
|
|
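In practice the difference is that :code:`eval` accepts a single expression and returns its value, while :code:`exec` accepts statements and always returns :code:`None`. One possible way to hide the distinction is to try to compile the input as an expression first and fall back to statements. The helper name :code:`run` is an assumption, not Sheerka's actual function.

```python
def run(source, namespace):
    """Evaluate source as an expression if possible, else execute it as statements."""
    try:
        code = compile(source, "<input>", "eval")
    except SyntaxError:
        # Not an expression (assignment, import, def...): execute it instead
        exec(compile(source, "<input>", "exec"), namespace)
        return None
    return eval(code, namespace)


namespace = {}
assigned = run("x = 1 + 1", namespace)  # a statement: nothing to return
value = run("x * 21", namespace)        # an expression: its value is returned
```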
|
|
The next issue that I will have to tackle is that Sheerka is not a REPL. After the execution
of the input, the system stops. Nothing is kept in memory (i.e. RAM).
The whole idea is to make Sheerka 'remember', even something that happened a long time ago.
So I should find a way to 'freeze time'.

To better explain what I have in mind, let's say that I want to pretty print an object

.. code-block:: python

    import pprint
    pp = pprint.PrettyPrinter(indent=4)
    pp.pprint(stuff)

I need three lines in order to be able to pretty print. I will first try dumping
globals() using pickle and loading it back whenever needed.

If it does not work as expected, I can find a way to save the commands and exec everything
when needed (the first time, I exec the import; the second time, I exec the import and the
:code:`pp = ...` assignment; and the last time, I exec the three statements).
|
|
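The fallback is worth sketching, because after :code:`import pprint` the globals contain a module object, and pickle cannot serialize modules. Below is a minimal sketch of the replay idea, where each call stands for one separate run of the system. The function name :code:`run_with_memory` and the JSON file used to store the commands are assumptions made for the example.

```python
import json
import os
import tempfile


def run_with_memory(history_path, new_command):
    """Replay every saved command, then the new one, in a fresh namespace."""
    commands = []
    if os.path.exists(history_path):
        with open(history_path, encoding="utf-8") as f:
            commands = json.load(f)
    commands.append(new_command)

    namespace = {}
    for command in commands:
        exec(command, namespace)

    # Persist only after a successful run, so a failing command is not remembered
    with open(history_path, "w", encoding="utf-8") as f:
        json.dump(commands, f)
    return namespace


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "commands.json")
    run_with_memory(path, "import pprint")
    run_with_memory(path, "pp = pprint.PrettyPrinter(indent=4)")
    namespace = run_with_memory(path, "text = pp.pformat({'a': 1})")
```

The obvious drawback is that side effects are replayed too: a command that prints or writes a file will do so again on every later run.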
|
|
2019-11-07
**********

Back on data persistence
""""""""""""""""""""""""
Last time, I talked about how to add new entries to the **State**. I only need the name of
the category and the object. If I add several objects under the same entry,
they don't override each other; they are kept as a list.

.. code-block:: python

    add("All_Concepts", "foo")
    add("All_Concepts", "bar")

will produce something like

.. code-block:: json

    {"All_Concepts" : ["foo", "bar"]}

The reason behind this choice is that, in the human world, the same name can refer to
several concepts. The first obvious cases are the homonyms: same word, but different
meanings. There are also other cases where the meaning of the word depends on the context.
Rather than forcing the user to spend time finding another way to express the concept
(as the name already exists), I prefer to allow storage under the same key.
The choice of the correct item to use in the list will be made at execution time.

I also need sdp to manage the key of my object. So 'entry' will be used to group objects,
and the key will help to access them quickly.

I don't want the signature :code:`add(entry, key, object)` because, although there is sometimes a key,
keys are not mandatory. So I keep the signature :code:`add(entry, object)`.

To manage the key, the object either is a key/value entry :code:`{key: value}` (Python dict),
has an attribute :code:`key`, or has a method :code:`get_key()`.

For example, **Concepts** have a method :code:`get_key()`, so if the key of 'concept' is "foo",
the code

.. code-block:: python

    add("All_Concepts", concept)

will produce something like

.. code-block:: python

    {"All_Concepts" : {"foo" : concept}}

If I add another concept (concept2) which has the key "bar", I will have

.. code-block:: python

    {"All_Concepts" : {"foo" : concept, "bar": concept2}}

and so on.

So under the 'All_Concepts' group, I have quick access to the concept "foo".

Note that if, for some reason, I end up with several concepts with the same key, they will
just be stacked as a list. I don't lose information.
|
|
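Here is a minimal sketch of the keyed variant, restricted to objects that do have a key. The explicit :code:`state` argument, the helper :code:`find_key`, the toy :code:`Concept` class and the order in which the three key sources are tried are all assumptions; objects without a key are left out of this sketch.

```python
class Concept:
    """Toy concept whose key is its name (hypothetical)."""

    def __init__(self, name):
        self.name = name

    def get_key(self):
        return self.name


def find_key(obj):
    """Look for a key: single-pair dict, then get_key(), then a key attribute."""
    if isinstance(obj, dict) and len(obj) == 1:
        return next(iter(obj))
    if hasattr(obj, "get_key"):
        return obj.get_key()
    if hasattr(obj, "key"):
        return obj.key
    raise ValueError("this sketch only handles objects that have a key")


def add(state, entry, obj):
    key = find_key(obj)
    value = obj[key] if isinstance(obj, dict) else obj
    group = state.setdefault(entry, {})
    if key not in group:
        group[key] = value
    elif isinstance(group[key], list):
        group[key].append(value)
    else:
        # Same key seen twice: stack the values, nothing is lost
        group[key] = [group[key], value]
    return state


state = {}
concept, concept2, other_foo = Concept("foo"), Concept("bar"), Concept("foo")
add(state, "All_Concepts", concept)
add(state, "All_Concepts", concept2)
add(state, "All_Concepts", other_foo)
```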
|
|
We will talk again about sdp later.

Status
""""""
As of today, I have a first implementation of several main functionalities of the system.


1. I have a good implementation of sdp

   * When I say good, I am talking about the coverage of the functionalities, not the efficiency of the code
   * I can add objects to the state
   * The objects can be saved as references (will be explained later)
   * I manage events
   * I manage history
   * I manage several types of serialisation

2. I have two parsers

   * DefaultParser: to detect Sheerka-specific language (like def concept)
   * PythonParser: to parse Python code
   * They are called for every new event

3. I have a first version of the evaluators

   * These are pieces of code that recognize a result and process it
   * The current algorithm is not finished, but it works for simple cases
   * I can create a new concept
   * I can evaluate simple Python expressions

4. I don't have the printers, but it's OK, I just dump the result of the processing

so I can type

::

    def concept hello name as "hello" + name
    1 + 1
    sheerka.test()

I will now work on how to call an already defined concept.