Deconstructing typed objects into types and values


Deconstructing typed objects into types and values

Postby Natural ChemE on July 13th, 2015, 4:12 am 

Most modern high-level programming languages have typed objects. For example, in
Code: Select all
double DoSomeMath(int x, int y, double z, bool trueOrFalse)
{
  if (trueOrFalse)
  {
    return x + y;
  }
  else
  {
    return z / (x + y);
  }
}
,
  • x and y are of type int;
  • z is of type double;
  • trueOrFalse is of type bool (which just means it's a true-or-false value, so the name "trueOrFalse" is a bit redundant).

I can't recall seeing this separation of type and value used anywhere before, which puzzles me, since it seems like a good idea. Has anyone seen it before, or is it a bad idea for some reason I'm not seeing?

We say that these variables are objects of their respective types. But it's possible to rewrite this code so that type and object are separated. First, note that operators are really just method calls, so let's rewrite the code more explicitly:
Code: Select all
double DoSomeMath(int x, int y, double z, bool trueOrFalse)
{
  if (trueOrFalse)
  {
    return Integer.Add(x, y);
  }
  else
  {
    return Double.Divide(z, Integer.Add(x, y));
  }
}
. Then we can rewrite it so that types and objects are passed separately:
Code: Select all
object DoSomeMath(
    Object x, Type typeOfX,
    Object y, Type typeOfY,
    Object z, Type typeOfZ,
    Object trueOrFalse, Type typeOfTrueOrFalse)
{
  if (typeOfTrueOrFalse.Value(trueOrFalse))
  {
    return {resolve operator overload to either typeOfX or typeOfY}.Add(x, y);
  }
  else
  {
    return {resolve operator overload to either typeOfZ or resulting type from the adding of x and y}.Divide(
        z
        , {resolve operator overload to either typeOfX or typeOfY}.Add(x, y)
      );
  }
}


Okay, so this works, but why do it? The big thing that I'm trying to resolve is the primitive-data-type-vs.-typed-object problem for efficient code.

For example, in a math-heavy programming language that I'm writing up, there will often be many, many method calls in which basic numeric values like doubles are passed. Even in high-level languages like C#, such primitives are considered value types rather than full classes (reference types).

Here I'm basically saying that all objects are treated like plain data and that their type information is passed as a separate parameter. In many methods where the argument types are specified, e.g. as in the code above, those type parameters can be removed by the optimizer since their (constant) values are always known. This also allows variable return types to be determined before execution in some cases: even if a method is allowed to return multiple types, if the optimizer can show that the method always returns the same type regardless of the objects' values, then the return type is known and its type information can be optimized away as well.
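
To make that concrete, here's a minimal C# sketch (with made-up names; not the actual system) of passing the type alongside the value. Only the int/int/double case from the example above is handled, and a real system would resolve the Add/Divide overloads from the type arguments; the point is that when the type arguments are compile-time constants at a call site, an optimizer could fold the branches below down to a single arithmetic expression.
Code: Select all
using System;

static class TypeValueSketch
{
    // Hypothetical rendering of DoSomeMath with type and value separated.
    public static object DoSomeMath(
        object x, Type typeOfX,
        object y, Type typeOfY,
        object z, Type typeOfZ,
        object trueOrFalse)
    {
        if ((bool)trueOrFalse)
        {
            if (typeOfX == typeof(int) && typeOfY == typeof(int))
                return (int)x + (int)y;                              // Integer.Add
            throw new NotSupportedException("no Add overload for these types");
        }
        else
        {
            if (typeOfX == typeof(int) && typeOfY == typeof(int) && typeOfZ == typeof(double))
                return (double)z / ((int)x + (int)y);                // Double.Divide
            throw new NotSupportedException("no Divide overload for these types");
        }
    }
}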


Re: Deconstructing typed objects into types and values

Postby Natural ChemE on July 13th, 2015, 4:45 am 

A big reason that I want to do this is as a generalization of multiple dispatch, which Julia seems quite proud of (it's one of the first features listed on their home page).

Basically, multiple dispatch is like method overloading, except that the selection can happen on run-time types: e.g. if there are methods DoSomething(int x) and DoSomething(string text), then which one runs depends on the type of the argument when DoSomething is called. If the type is known in advance, the particular method can be selected at compile time rather than at run time, for better performance.

I'd like to do it like this:
  1. User writes:
      Code: Select all
      DoSomething(int x) { ... }
      DoSomething(string text) { ... }
  2. System interprets as:
      Code: Select all
      DoSomething(Type typeOfArg1, object valueOfArg1)
      {
        switch (typeOfArg1)
        {
          case int:
            [ perform the ... user specified for DoSomething(int x) { ... } where x = valueOfArg1 ]
            break;
          case string:
            [ perform the ... user specified for DoSomething(string text) { ... } where text = valueOfArg1 ]
            break;
        }
      }
  3. When calls to DoSomething() are optimized, typeOfArg1 may or may not be known.
    1. If known, the optimizer folds the switch(typeOfArg1) away, effectively reducing the call down to the specific user-specified method.
    2. If unknown, the optimizer leaves it as-is, essentially leaving a multiple dispatcher in place.
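
For concreteness, below is a minimal C# sketch of the dispatcher in step 2. The bodies are hypothetical stand-ins for the user's "...", and since C# can't switch directly on a System.Type, the switch becomes an if/else chain here.
Code: Select all
using System;

static class DispatchSketch
{
    public static void DoSomething(Type typeOfArg1, object valueOfArg1)
    {
        if (typeOfArg1 == typeof(int))
        {
            int x = (int)valueOfArg1;
            Console.WriteLine("int branch: " + x);       // stand-in for DoSomething(int x) { ... }
        }
        else if (typeOfArg1 == typeof(string))
        {
            string text = (string)valueOfArg1;
            Console.WriteLine("string branch: " + text); // stand-in for DoSomething(string text) { ... }
        }
        else
        {
            throw new ArgumentException("no DoSomething overload for " + typeOfArg1);
        }
    }
}
A call like DoSomething(typeof(int), 42), where the first argument is a compile-time constant, is exactly the case where the chain can be folded down to the int branch (step 3.1); otherwise it stays as a run-time dispatcher (step 3.2).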

However, I'd say that this generalizes multiple dispatch because, in the general case, there's no fundamental distinction between the method selection and the method itself. For example, we're free to create a method:
    Code: Select all
    DoSomething(Type typeOfArg1, object valueOfArg1)
    {
      switch (typeOfArg1)
      {
        case int:
          [ perform the ... user specified for DoSomething(int x) { ... } where x = valueOfArg1 ]
          break;
        case string:
          if (Integer.TryParse((string)valueOfArg1, out parsedValueOfArg1))
          {
            [ perform the ... user specified for DoSomething(int x) { ... } where x = parsedValueOfArg1]
          }
          else
          {
            [ perform the ... user specified for DoSomething(string text) { ... } where text = valueOfArg1 ]
          }
          break;
      }
    }
, or something like that.

The above example isn't a particularly practical case, though as noted in What's a number?, I'm trying to do optimizations for math-intensive programs. I'd like to be able to blur the line between type and value in dispatch. For example, a type could just be Integer, and the selected method, handling an int, long, BigInt, etc., would be chosen based on the potential value range of the integer (which is known to the system from programmer specifications, discussed here).
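
As a rough illustration of that last idea (IntegerRange and the DoSomething* handlers are made-up names, not the actual system), the representation-specific handler could be picked from the declared range rather than from a conventional type:
Code: Select all
using System.Numerics;

// Hypothetical declared value range for an Integer.
sealed class IntegerRange
{
    public BigInteger Min;
    public BigInteger Max;
}

static class RangeDispatch
{
    public static void DoSomething(IntegerRange range, BigInteger value)
    {
        if (range.Min >= int.MinValue && range.Max <= int.MaxValue)
            DoSomethingInt((int)value);        // range fits in 32 bits
        else if (range.Min >= long.MinValue && range.Max <= long.MaxValue)
            DoSomethingLong((long)value);      // range fits in 64 bits
        else
            DoSomethingBig(value);             // arbitrary-precision fallback
    }

    static void DoSomethingInt(int x)        { /* int-specialized body */ }
    static void DoSomethingLong(long x)      { /* long-specialized body */ }
    static void DoSomethingBig(BigInteger x) { /* BigInteger body */ }
}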

A further advantage is that method linking can be partial! For example, if
    Code: Select all
    DoSomething(int x, int y, int z) { ... }
    DoSomething(int x, int y, double z) { ... }
    DoSomething(int x, double y, int z) { ... }
    DoSomething(double x, int y, int z) { ... }
    DoSomething(int x, double y, double z) { ... }
    DoSomething(double x, double y, int z) { ... }
    DoSomething(double x, double y, double z) { ... }
, which might easily arise as auto-generated code if users employ an interpreter that emits type-specific implementations of the user-written generic method
    Code: Select all
    DoSomething(Number x, Number y, Number z) { ... }
, and only the types of x and y are known (say both int), then the dispatch can be reduced to being based on the type of z, e.g.
    Code: Select all
    DoSomething(object valueOfArg1, object valueOfArg2, Type typeOfArg3, object valueOfArg3)
    {
      switch (typeOfArg3)
      {
        case int:
          [ perform DoSomething(int x, int y, int z) { ... } ]
          break;
        case double:
          [ perform DoSomething(int x, int y, double z) { ... } ]
        break;
      }
    }
, avoiding two thirds of the cost of dynamic dispatch.
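
Here's roughly what that partially linked dispatcher could look like in concrete C#, assuming the optimizer has already proven that x and y are ints (the helper names DoSomethingIII and DoSomethingIID are hypothetical):
Code: Select all
using System;

static class PartialDispatch
{
    // x and y are already known to be ints, so only z still carries a type tag.
    public static void DoSomething(int x, int y, Type typeOfZ, object z)
    {
        if (typeOfZ == typeof(int))
            DoSomethingIII(x, y, (int)z);       // DoSomething(int x, int y, int z)
        else if (typeOfZ == typeof(double))
            DoSomethingIID(x, y, (double)z);    // DoSomething(int x, int y, double z)
        else
            throw new ArgumentException("no overload for z of type " + typeOfZ);
    }

    static void DoSomethingIII(int x, int y, int z)    { /* ... */ }
    static void DoSomethingIID(int x, int y, double z) { /* ... */ }
}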

I'm conflicted due to a concern about code explosion. That is, if I create a new optimized version of DoSomething(...) for each call to it, then the code size blows up. Even if the computer's storage is sufficient, this would incur the same sorts of performance hits associated with excessive inlining. If I design the optimizer to aggressively apply common-subexpression elimination, then the various optimized redundancies would eventually be found and merged into a single copy, but that'd be costly. Instead, I would need to give the optimizer foreknowledge of this specific optimization approach and have it consult prior optimized (type-specific) versions of DoSomething before writing a new one. That would be favoritism, and while still better and more general than what other systems use, it strikes me as gross.
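
One possible mitigation, sketched below with made-up names, is to memoize specializations by type signature so that each (method, argument-types) combination is only ever generated once; this is essentially the "consult prior optimized versions" idea, just framed as a cache rather than as a special case wired into the optimizer. specializeFor stands in for whatever actually emits the type-specific code.
Code: Select all
using System;
using System.Collections.Generic;

static class SpecializationCache
{
    static readonly Dictionary<string, Delegate> cache = new Dictionary<string, Delegate>();

    public static Delegate GetOrCreate(string methodName, Type[] argTypes,
                                       Func<Delegate> specializeFor)
    {
        // Key on the method plus the concrete argument types.
        string key = methodName + "(" +
            string.Join(",", Array.ConvertAll(argTypes, t => t.FullName)) + ")";

        if (!cache.TryGetValue(key, out Delegate specialized))
        {
            specialized = specializeFor();   // emit the type-specific version only once
            cache[key] = specialized;
        }
        return specialized;
    }
}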

I'd prefer a general, optimization-implementation-agnostic approach. But if such an approach came at the cost of generality, it'd be even grosser than a non-agnostic implementation of the general approach.


Re: Deconstructing typed objects into types and values

Postby Natural ChemE on July 13th, 2015, 4:58 am 

Just to mention it, my other reason for wanting this approach is that the system/interpreter/whatever-you-call-it won't need to wrap the types of the underlying system.

For motivation: I'm writing my programming platform in C#. All objects within my system used to be of a type I created, S_Object ("System Object"), but as I reduced the design to an implementation with partially compiled versions, I had a lot of C# objects being wrapped just to pass through the non-compiled parts. By allowing objects to be opaque, unidentified entities, the uncompiled parts of the program are now free to pass C# objects around without wrapping them, trusting that their ultimate destination will have some sort of handler for them (which may simply be passing them on to a compiled part of the program where C# objects are required).

For example, say I compile a method Function AddOne(int x) { return x + 1; } down to the underlying language (C#). Calls to that method would then need to pass a C# int, not a system int. But since the syntax for a Method didn't allow non-system objects, that wasn't possible. Now it is, because Methods allow any object to be passed, even non-system ones.
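
A rough sketch of that boundary in C# (Boundary and CallAddOne are made-up illustrative names): the compiled AddOne is plain C#, and the uncompiled side just carries its argument as a bare object until a compiled method needs a concrete type.
Code: Select all
static class Boundary
{
    // The compiled form of: Function AddOne(int x) { return x + 1; }
    public static int AddOne(int x) => x + 1;

    // Uncompiled code passes values around as bare objects and only commits to
    // a concrete C# type at the point where a compiled method requires one.
    public static object CallAddOne(object nativeValue) => AddOne((int)nativeValue);
}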

While I want this system to be implementable in other languages, I suspect the same concept applies: whatever language it's running on, the system can pass that language's native objects around as its own values, with system-specific types serving as the handlers for them within the system.


