PxxxxR0
std::any::base<T>()

New Proposal,

This version:
http://virjacode.com/papers/any_base000.htm
Latest version:
http://virjacode.com/papers/any_base.htm
Author:
TPK Healy <healytpk@vir7ja7code7.com> (Remove all sevens from email address)
Audience:
SG18
Project:
ISO/IEC 14882 Programming Languages — C++, ISO/IEC JTC1/SC22/WG21

Abstract

Add a non-static template member function called base<T>() to the standard library class std::any to determine if the contained value is derived from T, and to get a reference or a pointer to the T sub-object (with the this pointer adjusted in cases of multiple inheritance).

Note: This can be achieved on the four main compilers without an ABI break - GNU g++, LLVM clang, Intel ICX, Microsoft Visual C++.

1. Introduction

We already have the standard library function std::any_cast to check whether a std::any object contains a value of a specific type. However, std::any_cast requires an exact type match: it fails if you specify a base class of the contained type, as demonstrated in the following snippet:

#include <any>
#include <stdexcept>

int main(void)
{
    std::any a = std::runtime_error("Frogs don't have wings");

    std::exception &e = std::any_cast<std::exception&>(a);  // throws std::bad_any_cast
}

This paper proposes to add a member function template called base to std::any as follows:

class any {
public:
  template<class B> requires is_lvalue_reference_v<B>
  B base();  // returns a reference

  template<class B> requires (is_lvalue_reference_v<B> && is_const_v< remove_reference_t<B> >)
  B base() const;  // returns a reference to const

  template<class B> requires is_pointer_v<B>
  B base();  // returns a pointer

  template<class B> requires (is_pointer_v<B> && is_const_v< remove_pointer_t<B> >)
  B base() const;  // returns a pointer to const
  . . .
};

Here T denotes the class type that B points or refers to. If the contained value is not derived from T, the pointer version returns a null pointer, and the reference version throws std::bad_any_cast.
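The intended behaviour can be approximated today in portable C++, which may help when reading the declarations above. The following is a minimal sketch, not the proposed interface: the wrapper any_with_base and its members base_ptr and base_ref are hypothetical names, and the wrapper relies on remembering the stored type at construction time (something std::any itself cannot do without the techniques discussed in the next section). It throws a pointer to the contained value and lets the catch clause perform the derived-to-base conversion.

```cpp
#include <any>
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical wrapper, for illustration only (not the proposed interface).
class any_with_base {
    std::any value_;
    // Throws a pointer to the contained value, typed as the stored type T.
    void (*thrower_)(std::any&);

public:
    template<class T>
    explicit any_with_base(T v)
        : value_(std::move(v)),
          thrower_([](std::any& a) { throw std::any_cast<T>(&a); })
    {}

    // Pointer form: null if the stored type is neither Base nor derived from it.
    template<class Base>
    Base* base_ptr()
    {
        try { thrower_(value_); }
        catch (Base* p) { return p; }  // pointer already adjusted to the Base sub-object
        catch (...) {}
        return nullptr;
    }

    // Reference form: throws std::bad_any_cast on a mismatch.
    template<class Base>
    Base& base_ref()
    {
        if (Base* p = base_ptr<Base>()) return *p;
        throw std::bad_any_cast();
    }
};

// Usage check mirroring the snippet in the Introduction.
inline void demo_any_with_base()
{
    any_with_base a{std::runtime_error("Frogs don't have wings")};
    std::exception& e = a.base_ref<std::exception>();
    assert(std::string(e.what()) == "Frogs don't have wings");
    assert(a.base_ptr<std::logic_error>() == nullptr);
}
```

The cost of this sketch is one extra function pointer per object and a throw on every query, which is why it is shown only to illustrate the semantics.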

2. Possible implementations

The easiest way to implement this new feature without causing an ABI break is to make use of the exception handling system, and so this is how it is implemented in the following two code snippets. Realistically though, compiler vendors can implement this new feature more efficiently by iterating through the list of base classes, which is provided in __vmi_class_type_info (Itanium ABI) or in _RTTICompleteObjectLocator or _ThrowInfo (Microsoft ABI).
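To make the "iterate through the list of base classes" alternative concrete, here is a rough sketch for the Itanium ABI. It assumes libstdc++'s <cxxabi.h>, which publishes the __si_class_type_info and __vmi_class_type_info class definitions; other runtimes may not expose them. The helper find_base_offset and the Left/Right/Both hierarchy are hypothetical names introduced for this example. The sketch considers only public, non-virtual bases and returns the first match without checking for ambiguity, so it is an illustration of the idea rather than a complete implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cxxabi.h>
#include <optional>
#include <stdexcept>
#include <typeinfo>

// Hypothetical helper: offset of the public, non-virtual base `target`
// inside an object whose dynamic type is `derived`, or nullopt if not found.
inline std::optional<std::ptrdiff_t>
find_base_offset(const std::type_info& derived, const std::type_info& target)
{
    using namespace __cxxabiv1;

    if (derived == target) return 0;

    // Single, public, non-virtual base at offset zero.
    if (auto* si = dynamic_cast<const __si_class_type_info*>(&derived))
        return find_base_offset(*si->__base_type, target);

    // General case: walk the array of direct bases.
    if (auto* vmi = dynamic_cast<const __vmi_class_type_info*>(&derived)) {
        for (unsigned i = 0; i < vmi->__base_count; ++i) {
            const __base_class_type_info& b = vmi->__base_info[i];
            if (!b.__is_public_p() || b.__is_virtual_p()) continue;  // not handled in this sketch
            if (auto off = find_base_offset(*b.__base_type, target))
                return b.__offset() + *off;
        }
    }
    return std::nullopt;  // not a class type, or no matching base
}

// Example hierarchy with multiple inheritance (hypothetical).
struct Left  { int l = 1; };
struct Right { int r = 2; };
struct Both : Left, Right { int b = 3; };

inline void demo_find_base_offset()
{
    Both obj;
    auto off = find_base_offset(typeid(Both), typeid(Right));
    assert(off.has_value());
    // The computed offset matches the adjustment made by static_cast.
    assert(reinterpret_cast<char*>(&obj) + *off ==
           reinterpret_cast<char*>(static_cast<Right*>(&obj)));
}
```

A lookup of this kind avoids throwing an exception on every call, at the price of depending on ABI-specific type_info layouts.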

2.1. GNU g++, LLVM clang, Intel ICX

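#include <cstddef>       // max_align_t, ptrdiff_t
#include <exception>     // exception_ptr, rethrow_exception, get_terminate
#include <type_traits>   // is_pointer_v, remove_pointer_t, remove_cvref_t
#include <typeinfo>      // type_info
#include <cxxabi.h>
#include <unwind-cxx.h>  // __cxa_refcounted_exception (internal to libstdc++)

// p_type_info and p_value are placeholder names for the any object's pointers
// to the type_info of the contained value and to the contained value itself.
template<class B> requires std::is_pointer_v<B>
B std::any::base()  // returns a pointer
{
    using Base = std::remove_cvref_t< std::remove_pointer_t<B> >;
    using namespace __cxxabiv1;

    // Build a fake exception header on the stack. The runtime expects the
    // thrown object to begin immediately after the header, hence &header + 1.
    alignas(std::max_align_t) __cxa_refcounted_exception header = {}; // all zeroes
    char *pobj = static_cast<char*>(static_cast<void*>(&header + 1));
    header.referenceCount = 123;  // arbitrary non-zero value
    header.exc.exceptionType       = const_cast<std::type_info*>( this->p_type_info );
    header.exc.exceptionDestructor = [](void*){};
    header.exc.unexpectedHandler   = reinterpret_cast<void(*)(void)>( std::get_terminate() );
    header.exc.terminateHandler    = std::get_terminate();
    __GXX_INIT_PRIMARY_EXCEPTION_CLASS(header.exc.unwindHeader.exception_class);
    header.exc.unwindHeader.exception_cleanup = [](_Unwind_Reason_Code, _Unwind_Exception*){};

    try
    {
        // Reinterpret pobj as a std::exception_ptr and throw the fake exception.
        std::rethrow_exception(*static_cast<std::exception_ptr*>(static_cast<void*>(&pobj)) );
    }
    catch(Base &obj)
    {
        // The handler receives the address adjusted to the Base sub-object;
        // apply the same adjustment to the real contained value.
        std::ptrdiff_t const delta = pobj - (char*)&obj;
        return (Base*)( (char*)this->p_value - delta );
    }
    catch(...){}  // the contained type is not derived from Base

    return nullptr;
}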

2.2. Microsoft Visual C++

#include <cassert>     // assert
#include <cstdint>     // int32_t
#include <typeinfo>    // type_info
#include <type_traits> // conditional, is_const
#include <utility>     // pair
#include <rttidata.h>  // _RTTICompleteObjectLocator, _RTTIClassHierarchyDescriptor

using std::pair, std::type_info;

struct ThrowInfo {
    std::int32_t attributes,
                 pmfnUnwind,
                 pForwardCompat,
                 pCatchableTypeArray;
};

#ifdef _WIN64
    extern "C" char *GetModuleHandleA(char const*);
    char const *const addr0 = GetModuleHandleA(nullptr);
#else
    char const *const addr0 = nullptr;
#endif

// Microsoft have an algorithm to get the address of a ThrowInfo struct
// from the address of a type_info, as can be observed by disassembling
// the x86_64 machine code for __RTDynamicCast.
// In the following function, I attempt to approximate this algorithm.
// See the comments inside the function.
ThrowInfo const *TypeInfo_To_ThrowInfo(std::type_info const *const pti)
{
    using std::int32_t;

    // Step 1: Moving backwards, find the address of
    // the type_info inside the CatchableType
    int32_t const n0 = (char*)pti - addr0, *p0 = (int32_t*)pti;
    for (;;)
    {
        if ( n0 != *--p0 ) continue;

        //cout << "Found the address of the type_info inside the CatchableType: " << (void*)p0 << endl;

        // Step 2: Calculate the address of the CatchableType
        // (it's 4 bytes behind the address of the type_info);
        char const *const pCatchableType = (char*)(p0 - 1);

        // Step 3: Moving forwards, find the address of the
        // CatchableType inside the CatchableTypeArray
        int32_t const n1 = pCatchableType - addr0, *p1 = (int32_t*)pCatchableType;
        for (;;)
        {
            if ( n1 != *++p1 ) continue;

            //cout << "Found the address of the CatchableType inside the CatchableTypeArray" << (void*)p1 << endl;
            for (;;)
            {
                // Step 4: Calculate the address of the CatchableTypeArray
                // by moving backward until we find a small number
                if ( *--p1 > 16 ) continue;

                //cout << "Found the address of the CatchableTypeArray: " << (void*)p1 << endl;

                // Step 5: Calculate the address of the _ThrowInfo by
                // subtracting 16 bytes (32-Bit) or 32 bytes (64-Bit)
                ThrowInfo *const pthrowinfo = (ThrowInfo*)(p1 - sizeof(void*));
                //cout << "Found the address of the ThrowInfo: " << (void*)pthrowinfo << endl;
                return pthrowinfo;
            }
        }
    }
}

template<class B> requires std::is_pointer_v<B>
B std::any::base()  // returns a pointer
{
    using Base = std::remove_cvref_t< std::remove_pointer_t<B> >;

    ThrowInfo const *const pthi = TypeInfo_To_ThrowInfo( this->p_type_info );

    struct CatchableTypeArray { std::int32_t nCatchableTypes, arrayOfCatchableTypes[]; };
    struct CatchableType      { std::int32_t properties, pType; _PMD thisDisplacement; };

    CatchableType const *p = nullptr;

    CatchableTypeArray const *const pcta = (CatchableTypeArray*)(addr0 + pthi->pCatchableTypeArray);
    for ( std::int32_t i = 0; i < pcta->nCatchableTypes; ++i )
    {
        CatchableType const *const pct = (CatchableType *)(addr0 + pcta->arrayOfCatchableTypes[i]);
        std::type_info const *const pti = (std::type_info*)(addr0 + pct->pType);
        if ( &typeid(Base) == pti )
        {
            p = pct;
            break;
        }
    }

    if ( nullptr == p ) return nullptr;

    return (Base*) ((char*)this->p_value + p->thisDisplacement.mdisp);
}

3. Design considerations

The aim is to add this functionality to std::any without causing an ABI break on any extant compilers, specifically paying attention to the Itanium ABI (used by GNU, LLVM, Intel) and the Microsoft ABI (used in Visual C++).

4. Proposed wording

The proposed wording is relative to [N4950].

In subclause __________

1 -- Wording to be provided in a later revision.

5. Impact on the standard

This proposal is a library extension. The addition has no effect on any other part of the standard.

6. Impact on existing code

No existing code becomes ill-formed. The behaviour of all existing code is unaffected.

References

Normative References

[N4950]
Thomas Köppe. Working Draft, Standard for Programming Language C++. 10 May 2023. URL: https://wg21.link/n4950