
Re: [MiNT] Shareable/loadable libraries: a preliminary document



Patrice Mandin wrote:
On Sat, 30 Oct 2004 23:02:30 +0200,
Philipp Donzé <philipp.donze@epfl.ch> wrote:
Even if an executable has only one relocation table, you know the size of the TEXT segment, so you know whether you are relocating an address
in the TEXT or the DATA segment, so you can align the segments on MMU
page boundaries.
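
A minimal sketch of that idea in C, assuming a simplified image where each relocation entry is just the index of a 32-bit word holding an address relative to the start of the file (this is not the real executable format, and `relocate_image` and its parameters are hypothetical names). Because the TEXT size is known, one table is enough to send each address to the right segment base, so TEXT and DATA can be loaded at independently aligned addresses:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Apply a single relocation table to an image whose TEXT and DATA
 * segments are loaded at separate base addresses.
 * Hypothetical layout: words[reloc_idx[i]] holds a file-relative address.
 */
void relocate_image(uint32_t *words, const size_t *reloc_idx, size_t n_relocs,
                    uint32_t text_size, uint32_t text_base, uint32_t data_base)
{
    for (size_t i = 0; i < n_relocs; i++) {
        uint32_t *slot = &words[reloc_idx[i]];

        if (*slot < text_size) {
            /* The address points into TEXT. */
            *slot += text_base;
        } else {
            /* The address points into DATA (or BSS), which starts
             * right after TEXT in the file. */
            *slot = data_base + (*slot - text_size);
        }
    }
}
```

As noted in the reply below, this only covers absolute addresses that appear in the relocation table; a PC-relative reference from TEXT into DATA has no entry to patch.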

But this doesn't work if the compiler generates PC-relative addresses. For example, "move.l data(PC), D0" can't work correctly if the assembler does not know the correct offset to generate. I must admit, I'm new to GCC, but Pure C makes extensive use of PC-relative addressing modes, which give smaller and faster code.

You're right, I did not think of the PC-relative case. But I don't
know whether gcc generates this type of code to access the DATA segment.
You mention Pure C, but I don't know if creating dynamically linked
libraries will be possible with it, so I'll just stick with binutils+gcc.
And on Linux, the segments are aligned on a page boundary to make the TEXT
segment read-only, so either:
- the alignment is done in the file, or
- PC-relative addressing is not used when accessing a different segment

On systems where -fPIC is supported, the compiler assumes that the data segment will always be mapped at a fixed offset from the text segment. Thus an MMU is required so that every process sees the same memory map. This has always been the stumbling block that prevented using gcc's existing PIC support for Atari; we always had to bend over backward and support 68000 in addition to everything else so we couldn't rely on the presence of an MMU or virtual memory.

The fix for this was to add the -mbaserel support to gcc, so that all data references were relative to a base register which was loaded with the address of the data+bss segment. Unfortunately I don't think the baserel code has been maintained through recent gcc releases. It's been a long time since I looked at the gcc/Atari code.

Also, as I recall, the MiNT kernel support for baserel executables was dropped a long time ago.

The hardest part in making the baserel code work for shared libraries was loading the proper offset into the base register when calling into a shared library function. On Unix systems with virtual memory, there's no problem as I noted above because the code can just use a fixed offset from its own text segment. If you can mandate that virtual memory is being used, you can use the same approach. Otherwise you need a search function to locate the data segment for the current shared library.
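
To make the "search function" concrete, here is a rough sketch in C, assuming the loader keeps a small table of (process, library) pairs. All names (`data_binding`, `register_data_segment`, `find_data_segment`) and the fixed table size are hypothetical; nothing here is existing MiNT kernel code:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BINDINGS 16

/* One entry per (process, library) pair: where that process's private
 * copy of the library's data+bss segment lives. */
struct data_binding {
    int   pid;
    int   lib_id;
    void *data_base;
};

static struct data_binding bindings[MAX_BINDINGS];
static size_t binding_count;

/* Called by the loader when it allocates a data segment for a library. */
int register_data_segment(int pid, int lib_id, void *data_base)
{
    if (binding_count >= MAX_BINDINGS)
        return -1;
    bindings[binding_count].pid = pid;
    bindings[binding_count].lib_id = lib_id;
    bindings[binding_count].data_base = data_base;
    binding_count++;
    return 0;
}

/* The search function: returns the value to load into the base
 * register, or NULL if this process has not loaded the library. */
void *find_data_segment(int pid, int lib_id)
{
    for (size_t i = 0; i < binding_count; i++) {
        if (bindings[i].pid == pid && bindings[i].lib_id == lib_id)
            return bindings[i].data_base;
    }
    return NULL;
}
```

A linear scan is only for illustration; since this lookup would run on every call into a library, a real implementation would want something cheaper.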

I believe the correct approach is still to use baserel, even though it sacrifices an address register. This is the approach that SPARC and other RISC machines use (of course, they have more registers to work with).

I'm not sure that you need to define a new EXE format, but it may be helpful to adopt M68K ELF format. That lets you leverage a lot of existing binutils support.

For flexibility, I believe you should adopt the external linker approach - i.e., on a dynamically linked program, the first piece of code executed simply loads ld.so which goes through and links all the dependent shared libraries. Again, you can adopt most of the dynamic linking support already in GNU libc if you follow this route, so life should be relatively easy.

Every shared library should export a Procedure Linkage Table (PLT) which external modules will use to access the library's entry points. To support lazy resolution, every PLT entry will initially point to a function that invokes ld.so. This function then searches the dynamic link table to resolve the symbol to its actual entry point.
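
Lazy resolution can be modelled in plain C with a table of function pointers. This is only a sketch under simplifying assumptions: an unresolved slot is marked with NULL instead of pointing at a resolver stub, the resolved address is cached in the slot, and `dynsym`, `plt`, `resolve` and `call_via_plt` are illustrative names, not part of binutils or GNU libc:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*entry_fn)(int);

/* Two "library" functions that callers reach only through the PLT. */
static int lib_double(int x) { return 2 * x; }
static int lib_negate(int x) { return -x; }

/* Stand-in for the library's dynamic link (symbol) table. */
static const struct { const char *name; entry_fn fn; } dynsym[] = {
    { "lib_double", lib_double },
    { "lib_negate", lib_negate },
};

/* The PLT: one slot per imported function, unresolved at load time. */
static struct { const char *name; entry_fn target; } plt[] = {
    { "lib_double", NULL },
    { "lib_negate", NULL },
};

static int resolve_count;   /* how often the resolver actually ran */

/* Plays the role of ld.so: look the symbol up by name. */
static entry_fn resolve(const char *name)
{
    resolve_count++;
    for (size_t i = 0; i < sizeof dynsym / sizeof dynsym[0]; i++) {
        if (strcmp(dynsym[i].name, name) == 0)
            return dynsym[i].fn;
    }
    return NULL;
}

/* Call through a PLT slot, resolving it on first use only. */
int call_via_plt(size_t slot, int arg)
{
    assert(slot < sizeof plt / sizeof plt[0]);
    if (plt[slot].target == NULL)
        plt[slot].target = resolve(plt[slot].name);
    assert(plt[slot].target != NULL);
    return plt[slot].target(arg);
}
```

Only the first call through a slot pays for the symbol search; later calls cost one extra indirection.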

For sharing compatibility on non-MMU systems, I suggest that this entry point actually be a search function that locates the data segment for the shared library. The binutils linker will need to generate these stubs. All calls within a library can execute directly; all calls from outside the library must perform a segment swap on the base register.
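
The segment swap can be pictured like this in C, with a global variable standing in for the reserved address register. It is a sketch only: real stubs would be generated by the linker in assembly, and here the library's data address is passed in as a parameter rather than found by a search. `base_reg`, `lib_data` and both function names are hypothetical:

```c
#include <assert.h>

/* Stand-in for the address register reserved by -mbaserel. */
static void *base_reg;

/* The library's data segment, one private copy per process. */
struct lib_data {
    int calls;
};

/* Library-internal code: assumes base_reg already points at the
 * library's own data, so calls within the library need no stub. */
static int lib_count_call(void)
{
    struct lib_data *d = base_reg;
    return ++d->calls;
}

/* Entry stub for callers outside the library: swap the base register
 * to the library's data, run the function, then restore the caller's
 * base so its own data references keep working. */
int lib_count_call_stub(void *lib_data_base)
{
    void *saved = base_reg;
    base_reg = lib_data_base;
    int result = lib_count_call();
    base_reg = saved;
    return result;
}
```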

When compiling PIC, code never directly accesses data; it always goes through a Global Offset Table (GOT). One of the jobs of ld.so is to populate the GOT with the actual address of every data variable.
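
As a rough C model of that indirection (the compiler and linker generate the real thing; `got`, `populate_got`, `lib_data` and `bump_counter` are made-up names for illustration): the code below never names a variable directly, it only loads addresses from the table, so the same code works wherever the loader happens to place the data.

```c
#include <assert.h>
#include <stddef.h>

/* Layout of the library's data segment (hypothetical). */
struct lib_data {
    int counter;
    int limit;
};

/* One GOT slot per data variable the code refers to. */
enum { GOT_COUNTER, GOT_LIMIT, GOT_SIZE };

static void *got[GOT_SIZE];

/* What ld.so would do at load time: turn each variable's offset
 * within the data segment into an actual address. */
void populate_got(char *data_base, const size_t *offsets, size_t n)
{
    assert(n <= GOT_SIZE);
    for (size_t i = 0; i < n; i++)
        got[i] = data_base + offsets[i];
}

/* "PIC" code: every data access goes through the GOT. */
int bump_counter(void)
{
    int *counter = got[GOT_COUNTER];
    int *limit = got[GOT_LIMIT];

    if (*counter < *limit)
        (*counter)++;
    return *counter;
}
```

Repopulating the table with a different data base makes the unchanged `bump_counter` operate on a different copy of the data, at the price of one extra load per access.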

There will be a noticeable decrease in performance for dynamic code vs static code because of this indirection, but it needs to be done, even with an MMU.

The linker must generate the GOT and include it in the object files. Most of the code to do this already exists for ELF platforms.

Also, ld.so should be a baserel program, but I don't think it needs to be fully PIC.
--
  -- Howard Chu
  Chief Architect, Symas Corp.       Director, Highland Sun
  http://www.symas.com               http://highlandsun.com/hyc
  Symas: Premier OpenSource Development and Support