
Json Serialization Refactor (Removing Newtonsoft & Fixing NotMapped being serialized) #38


Merged
merged 12 commits on Mar 6, 2025

Conversation

magiccodingman
Owner

Goals

1.) Centralize JSON serialization for better control.
2.) Remove Newtonsoft from the serialization process.
3.) Refactor the serialization process for better performance.
4.) Make specific refactors to allow AOT use, accounting for AOT's reflection-stripping mechanisms.
5.) Resolve the "MagicNotMapped" attribute not being respected during serialization.

Centralizing Json Serialization

A new class called MagicSerializationHelper was created for internal use within the project. Whenever serialization occurs anywhere in the library, these new magic helpers should always be used; they wrap the following functionality:

public static string SerializeObject<T>(T value, MagicJsonSerializationSettings? settings = null)

public static T? DeserializeObject<T>(string json)

public static void PopulateObject<T>(T source, T target)

This centralizes serialization use. Additionally, it enforces that MagicSerializationHelper.SerializeObject respects the MagicNotMapped attributes during the serialization process. So it's very important that whenever the library needs to communicate with JS, we utilize these centralized methods. There are also future capabilities I'm considering for which this centralization will make refactoring easier.
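
To make the intended call pattern concrete, here's a minimal sketch. The Person model is hypothetical; only the helper methods and the attribute name come from this PR:

// Hypothetical model for illustration.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; } = "";

    [MagicNotMapped]
    public string CachedDisplayText { get; set; } = ""; // excluded when serialized
}

var person = new Person { Id = 1, Name = "Ada", CachedDisplayText = "Ada (#1)" };

// Old call sites used Newtonsoft directly, which ignored MagicNotMapped:
//   string json = JsonConvert.SerializeObject(person);
// New call sites route through the centralized helper instead:
string json = MagicSerializationHelper.SerializeObject(person);
Person? copy = MagicSerializationHelper.DeserializeObject<Person>(json);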

After this was completed, all instances of Newtonsoft were removed except for the newly created MagicSerializationHelper class.

Removing Newtonsoft

The largest use case for Newtonsoft was the flexibility provided by "Newtonsoft.Json.Linq", which was primarily utilized in a critical method called "AddConditionInternal" inside "IndexDbManager". Utilizing "System.Text.Json.Nodes", I was able to replace the use of the Newtonsoft LINQ library. Though JSON nodes are less flexible and capable, the swap had no negative results, and they provide plenty of functionality for the library's use case.
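
For illustration, the general shape of that swap looks like this; the actual condition logic in AddConditionInternal is more involved, and recordJson and the property name here are placeholders:

using System.Text.Json.Nodes;

string recordJson = "{\"name\":\"Ada\"}";

// Before (Newtonsoft.Json.Linq):
//   JObject obj = JObject.Parse(recordJson);
//   JToken? token = obj["name"];

// After (System.Text.Json.Nodes):
JsonNode? node = JsonNode.Parse(recordJson);
JsonNode? token = node?["name"];
string? name = token?.GetValue<string>();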

Once this was resolved, with only the MagicSerializationHelper still using Newtonsoft, I was able to move entirely to "System.Text.Json" without any use of Newtonsoft. I have also gone ahead and removed Newtonsoft from the project's package references.

Refactored Serialization

It's important, not just for performance but also for AOT, to limit reflection and cache the results when necessary. The helper class "ExpandoToTypeConverter" was created to efficiently handle what Newtonsoft LINQ had been heavily utilized for. The expando type converter properly converts and caches what was mapped, so future mappings won't require rerunning any discovery and instead efficiently utilize the cache after the first run.

Within AddRangeAsync, this replaced the commonly utilized line:

myClass = JsonConvert.DeserializeObject<T>(JsonConvert.SerializeObject(processedRecord));

That line was written to combat a variety of issues: even though the processed record was guaranteed to be of type T, it had to be serialized to disconnect it from any entity type and then deserialized back to type T. This was incredibly inefficient.

Now the code utilizes:

myClass = ExpandoToTypeConverter<T>.ConvertExpando(expando);

Within the converter, concrete classes are handled efficiently with no reflection (for AOT), pre-computing properties and caching the results for future reuse. To remove the need for reflection while keeping this incredibly performant, the code only works on concrete classes, which is why it detects whether the provided class is concrete. If the class is not concrete, the code recognizes this and runs the following, as you'll note in the code:

var instance = MagicSerializationHelper.DeserializeObject<T>(MagicSerializationHelper.SerializeObject(expando))!;

// Check if we can cache this as a known type
if (!NonConcreteTypeCache.ContainsKey(type))
{
    NonConcreteTypeCache[type] = true;

    // Dynamically compute property setters for this type
    PrecomputePropertySetters(type);
}

This means non-concrete classes (which are very uncommon for this case) default back to the original catch-all serialize-then-deserialize method. The converter then takes the results, grabs the keys, and runs them back through the pre-compute setter method to store what it has learned. So even if it has to default to a less efficient method of grabbing the instance, it only has to do so the very first time; after that, the cached mapping prevents this from occurring again.
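
For context, here's a sketch of what the setter pre-computation can look like. The names mirror the snippet above, but the expression-tree approach is an assumption rather than a transcript of the actual code (and note that under AOT, Expression.Compile falls back to an interpreter):

using System.Collections.Concurrent;
using System.Linq.Expressions;
using System.Reflection;

internal static class SetterCacheSketch<T>
{
    private static readonly ConcurrentDictionary<string, Action<T, object?>> Setters = new();

    // Assumes type is typeof(T), as in the converter's usage above.
    public static void PrecomputePropertySetters(Type type)
    {
        foreach (PropertyInfo prop in type.GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            if (!prop.CanWrite || prop.GetIndexParameters().Length > 0)
                continue;

            ParameterExpression target = Expression.Parameter(typeof(T), "target");
            ParameterExpression value = Expression.Parameter(typeof(object), "value");
            Expression assign = Expression.Assign(
                Expression.Property(target, prop),
                Expression.Convert(value, prop.PropertyType));

            // Compile once; later conversions reuse the cached delegate.
            Setters[prop.Name] = Expression.Lambda<Action<T, object?>>(assign, target, value).Compile();
        }
    }
}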

Resolving MagicNotMapped being Serialized

Within the, "MagicSerializationHelper" the SerializerObject method basically uses the standard System.Text.Json serialization process but note the magic json serialization settings and option resolver added.

public static string SerializeObject<T>(T value, MagicJsonSerializationSettings? settings = null)
{
    if (settings == null)
        settings = new MagicJsonSerializationSettings();

    if (value == null)
        throw new ArgumentNullException(nameof(value), "Object cannot be null");

    var options = settings.GetOptionsWithResolver<T>(); // Ensure the correct resolver is applied

    return JsonSerializer.Serialize(value, options);
}

The MagicJsonSerializationSettings is the new settings class used when serializing or deserializing, instead of the default System.Text.Json serializer options, as we require an override for manual serialization using:

public class MagicJsonSerializationSettings
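
Roughly, its shape is as follows. This is a sketch, and the UseCamelCase flag name is an assumption, since the PR only states that camel casing can be toggled:

using System.Text.Json;

public class MagicJsonSerializationSettings
{
    // Hypothetical flag name for the camel-case override described below.
    public bool UseCamelCase { get; set; }

    public JsonSerializerOptions GetOptionsWithResolver<T>()
    {
        var options = new JsonSerializerOptions
        {
            PropertyNamingPolicy = UseCamelCase ? JsonNamingPolicy.CamelCase : null
        };

        // Route T through the custom converter so [MagicNotMapped] is honored.
        options.Converters.Add(new MagicContractResolver<T>());
        return options;
    }
}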

I allowed an override to enable camel casing, as you can see. But the real heavy hitter in this resolution is the new MagicContractResolver:

internal class MagicContractResolver<T> : JsonConverter<T>

The contract resolver, just like all the other additions, uses caching after the first run to prevent rediscovery of mappings or attributes on already-discovered properties.

The new resolver hardcodes knowledge of the "MagicNotMappedAttribute" as an attribute to look out for and ignore. Whether a property is encountered during the first discovery pass or found among the previously discovered and cached properties, the resolver knows to skip it.
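
Conceptually, the ignore decision is computed once per property and then cached, something like this sketch (the actual keying strategy may differ):

using System.Collections.Concurrent;
using System.Reflection;

private static readonly ConcurrentDictionary<PropertyInfo, bool> IgnoreCache = new();

private static bool ShouldIgnore(PropertyInfo property) =>
    IgnoreCache.GetOrAdd(property,
        p => p.GetCustomAttribute<MagicNotMappedAttribute>() is not null);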

This resolves the serialization issue with "MagicNotMappedAttribute" not being respected while also significantly increasing the performance of all serialization moving forward. Additionally this leaves room to easily scale into more specialized serialization tactics in the future if desired.

Large Performance Improvements

With the limited use of reflection, the removal of unnecessary serialization, and the removal of Newtonsoft, this will be a massive performance boost for the MagicIndexDB library. Additionally, since MagicSerializationHelper.SerializeObject and many other methods now utilize caching strategies to remember mappings, not only are the initial discovery and mapping faster, but once mappings are cached, consecutive serialization runs using that model will be significantly faster.

…jects.

Added a new custom serialization process to centralize serialization with wrappers and to enforce that the not-mapped attribute is respected during serialization.

Created multiple caching dictionaries, so after the first discovery of an attribute, mapped location, or similar, the result is cached and doesn't require rediscovery on follow-up use.

Removed all use cases of Newtonsoft JSON serialization.
…fix during serialization. This was fixing the issue.
@magiccodingman magiccodingman self-assigned this Feb 18, 2025
@magiccodingman magiccodingman added bug Something isn't working documentation Improvements or additions to documentation Feature Request labels Feb 18, 2025
@yueyinqiu
Collaborator

yueyinqiu commented Feb 19, 2025

Wow! This does solve many problems! Hmm... But in my opinion the serializer could do more.

Let's take a look at the whole process, for example in AddAsync:

  1. we convert the record to a dictionary
  2. map the dictionary keys
  3. and pass the dictionary to the JavaScript layer (Blazor will serialize/deserialize it again)

We import many complex things in the first step, including reflection, ExpandoObject, and duplicate serialize/deserialize passes, just to support our column names. So why don't we just do this:

  1. serialize the record with our serializer, which deals with all the attributes we provide
  2. pass the JSON string to the JavaScript layer and deserialize it in JavaScript (maybe we could use a stream or something else to avoid serializing the string again)

In this case we won't be bothered by any behavior of Blazor. We don't need the dictionary as an intermediate value. Users can easily know (or even completely customize) what will actually be written in JavaScript. The performance will also be much better.

@yueyinqiu
Collaborator

yueyinqiu commented Feb 19, 2025

By the way, I can't run the test project. It says: Cannot bind to the target method because its signature is not compatible with that of the delegate type.

Anyway I will first accept this to avoid blocking the merge.

yueyinqiu
yueyinqiu previously approved these changes Feb 19, 2025
@yueyinqiu
Collaborator

yueyinqiu commented Feb 19, 2025

I have merged this to a branch of mine to try my idea before: https://github.com/yueyinqiu/Magic.IndexedDb/tree/serialize

@yueyinqiu
Collaborator

yueyinqiu commented Feb 19, 2025

And the error Cannot bind to the target method because its signature is not compatible with that of the delegate type. seems to happen because it's trying to get the value of a list's Item property.

We could add this to solve the problem:

public override bool CanConvert(Type typeToConvert)
{
    return typeToConvert.GetCustomAttribute<MagicTableAttribute>() is not null;
}

But this will make things not work for nested objects. So perhaps we need an attribute for nested objects? Not really sure about this...

@magiccodingman
Owner Author

Mind giving me the exact test and scenario you're using? Or maybe I can try out your branch with your tests. This is the issue I ran into in the past with custom serialization: it's very powerful, but there are a lot of things you have to juggle. So it's good you ran into an issue, because it's normally an easy fix. It's just annoying getting through the first phase of testing.

@magiccodingman
Owner Author

I'll do some tests with the nested objects as well. Funnily enough, I did not even think about nested objects, so I can add that to this code. I know how.

@yueyinqiu
Collaborator

yueyinqiu commented Feb 19, 2025

Mind giving me the exact test and scenario you're using?

Hmm... The error just occurs when running the current Indexdb.Example. It's trying to serialize a list as a normal object by getting property values, and that's the problem.

By the way, what do you think of #38 (comment)? It might change the basic structure of the whole project. So maybe first take a look at it?

@magiccodingman
Owner Author

Yea, I was reviewing that possibility. Before I merge this branch, I'm going to first fix the more serious serialization issues you caught. Then I'll review the AddAsync process and how a lot of this works. It has been quite some time since I've dug into the thick of this code. But even though it has been a while, there are alarm bells in my head.

I remember having to be very careful with how a lot of the serialization and process worked. It's extremely finicky. There's a lot of things that don't seem important, but I have vague memories of creating weird paths to resolve edge cases.

And the idea is for sure the direction I'd like to go down. Especially with the stream, though in my opinion, that becomes much more powerful with the callback.

@yueyinqiu
Collaborator

yueyinqiu commented Feb 24, 2025

Hmm... It still looks hard to do this. It's really complex for us to support some edge cases, for example, a record type whose properties are read-only. And when I was trying to move the unit tests to this PR, it seemed that static properties are also added into the database. (I haven't checked why that happens, or whether it also happens in the current version.) So a completely customized serializer is not that simple for us. There are too many edge cases.

In this case, I would suggest the original json attributes again to simplify our work.

And about the problem of DTO sharing: I know it would make things more convenient, but personally I don't quite agree with it. After all, the frontend and backend data are not exactly the same, which may cause problems when modifying them later.

@magiccodingman
Owner Author

Yea, this was my issue when doing manual serialization in the past, but I have a fallback. Basically a "here's all the things we can handle" path, and then a "here's the overkill to make anything work" path. I'll try to get a fix in for this soon. Had a bit of a code emergency with my company, so I've not been able to touch this for a few days.

@magiccodingman
Owner Author

Once I fix it up, if I can't get it to work in all edge cases, I'll remove that aspect and shelve the idea until a later date. Haha, manual serialization can be very annoying. I've had to write manual serializations like 3-4 times in my career. It has always caused headaches lol

@magiccodingman magiccodingman dismissed yueyinqiu’s stale review February 24, 2025 16:31

This branch isn't ready

@magiccodingman
Owner Author

@yueyinqiu So, I was able to reproduce that error: "Cannot bind to the target method because its signature is not compatible with that of the delegate type." I actually misunderstood where you were saying the issue was coming from. And it was my mistake missing it, because my tests were all passing due to the cache in my browser test being old, thus not actually running all the new code.

The updated "MagicContractResolver" should resolve the issue. Here's documentation on it as I'm using this whole PR as documentation on this pretty significant change to the project:

MagicContractResolver Documentation

Overview

The MagicContractResolver<T> is designed to handle serialization and deserialization of objects ranging from simple primitives to deeply nested, complex structures. It leverages reflection, caching, and comprehensive fallbacks to guarantee serialization reliability in all scenarios.

Core Functionality

1. Simple Type Optimization

The resolver initially performs a type check through the IsSimpleType method:

  • Handles primitive types (int, double, bool, etc.), enums, strings, Guid, Uri, DateTime, TimeSpan, and other commonly used simple types.
  • Serializes these types directly for performance and simplicity (a sketch of the check follows below).
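
A sketch of what such a check typically looks like; the exact membership of the "simple" set here is illustrative:

private static bool IsSimpleType(Type type)
{
    // Unwrap Nullable<T> so int? is treated like int.
    type = Nullable.GetUnderlyingType(type) ?? type;

    return type.IsPrimitive
        || type.IsEnum
        || type == typeof(string)
        || type == typeof(decimal)
        || type == typeof(Guid)
        || type == typeof(Uri)
        || type == typeof(DateTime)
        || type == typeof(TimeSpan);
}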

2. Collection Handling

The resolver identifies collections using the IEnumerable interface (excluding strings to prevent improper character enumeration).

  • Supports generic lists, arrays, dictionaries, and any object implementing IEnumerable.
  • Iterates through collection items, recursively invoking serialization logic on each item, ensuring accurate handling of nested collections.

3. Complex Object Serialization

Complex, user-defined objects are handled through reflection:

  • Iterates over all public, instance properties of the object.
  • Skips properties with the [MagicNotMappedAttribute] to provide flexibility in controlling serialized content.
  • Skips indexer properties to avoid binding errors.
  • Each property is individually serialized, allowing deep recursion into nested structures seamlessly (see the sketch below).
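
Put together, the property loop looks roughly like the following sketch. ShouldIgnore stands in for the cached attribute check shown earlier, and this is illustrative rather than the verbatim source:

public override void Write(Utf8JsonWriter writer, T value, JsonSerializerOptions options)
{
    writer.WriteStartObject();

    foreach (PropertyInfo prop in value!.GetType().GetProperties(BindingFlags.Public | BindingFlags.Instance))
    {
        // Skip indexers (the source of the delegate-binding error) and [MagicNotMapped] properties.
        if (prop.GetIndexParameters().Length > 0 || ShouldIgnore(prop))
            continue;

        object? propValue;
        try
        {
            propValue = prop.GetValue(value);
        }
        catch
        {
            propValue = null; // property-level fallback, described under error handling below
        }

        writer.WritePropertyName(options.PropertyNamingPolicy?.ConvertName(prop.Name) ?? prop.Name);
        JsonSerializer.Serialize(writer, propValue, propValue?.GetType() ?? typeof(object), options);
    }

    writer.WriteEndObject();
}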

Robustness and Error Handling

  • Property-Level Error Handling: Any exceptions thrown while retrieving property values via reflection result in gracefully writing a null instead of interrupting the entire serialization.

  • Complete Serialization Fallback: A comprehensive try-catch block wraps the serialization process. In the event of a catastrophic serialization failure, it falls back gracefully to the default system serializer provided by .NET, ensuring serialization never entirely fails.

Recursive Serialization

  • The resolver inherently supports infinite depth recursion. Nested objects, collections within collections, dictionaries with complex values, and recursive object graphs are serialized without issue.
  • Leveraging built-in .NET serialization recursively ensures accuracy and reliability regardless of complexity.

Performance Optimization via Caching

  • Uses a thread-safe ConcurrentDictionary to cache results of expensive reflection operations, specifically determining whether properties should be ignored.
  • Significantly reduces overhead during repeated serializations of similar object types.

Why It's Bulletproof

The MagicContractResolver is designed for absolute resilience:

  • No Known Failures: Exception handling at multiple levels (property and complete object) ensures serialization always succeeds.
  • Universal Compatibility: Handles every known scenario, from primitive types to complex nested graphs, collections, and dictionaries.
  • Reflection Safeguards: Explicitly skips problematic scenarios such as indexers, preventing known reflection-related serialization errors.

Usage Scenarios

Simple Scenarios

  • Primitive types (e.g., integers, booleans, strings).
  • Standard .NET types (e.g., DateTime, Guid).

Complex Scenarios

  • Nested custom objects containing other complex objects.
  • Collections of complex objects, lists within dictionaries, and vice versa.
  • Objects with selective property serialization via [MagicNotMappedAttribute].
  • Extremely deep recursive object graphs.

@yueyinqiu
Collaborator

yueyinqiu commented Feb 25, 2025

That looks nice!

But I'm not very sure about the error handling. I think the property-level one is OK, but I don't think it should fall back to the system serializer. Our serializer should be considered completely independent from the system serializer, even though we actually use it as the backend, since we don't respect all of its features.

If there is an error, we should let users know that it's not really supported, so that they can adjust their type structure or ask us for new features. Otherwise there might be potential problems during deserialization or package upgrades.

@magiccodingman
Owner Author

There we go :D Got the try-catch removed.

@yueyinqiu
Collaborator

yueyinqiu commented Feb 26, 2025

Hmm... It seems that the ignored property is still there... since we convert the JSON string back to the original type. Maybe we should take #38 (comment) into consideration in this PR to totally fix the problems.

I have created #42 to merge my unit tests into this, and then we can know whether it behaves as expected.

@magiccodingman
Owner Author

Just letting you know I've not forgotten about this! I'm kind of having a critical fire situation with my business software. Got to love the sleepless nights fixing your own mistakes lol. But rest assured I've not mysteriously gone MIA.

@magiccodingman
Owner Author

@yueyinqiu I'm still putting out fires on my side haha. I know you were saying we could knock out all the items in this PR, but do you think it may be good to merge this now if it's stable, and then open another PR for the next issue(s)? That way we can keep things moving.

@yueyinqiu
Collaborator

yueyinqiu commented Mar 6, 2025

Well... but currently it hasn't fixed any problems...

Hmm... Anyway, I will merge #42 here first and approve this PR, so you can decide whether to merge this or not.

@magiccodingman
Owner Author

Oh, I think I know what you're saying; I think I misunderstood. Were you suggesting that AddAsync be rewritten something like this?

public async Task<TKey> AddAsync<T, TKey>(T record, CancellationToken cancellationToken = default) where T : class
{
    string schemaName = SchemaHelper.GetSchemaName<T>();

    // Process the record (if any additional modifications are needed before serialization)
    object? processedRecord = await ProcessRecordAsync(record, cancellationToken);

    // Apply property mappings BEFORE serialization to JSON
    processedRecord = ManagerHelper.ApplyPropertyMappings(processedRecord);

    // Serialize the mapped record to JSON
    string serializedRecord = CustomJsonSerializer.Serialize(processedRecord);

    // Create a StoreRecord with the serialized JSON string
    StoreRecord<string> recordToSend = new StoreRecord<string>
    {
        DbName = this.DbName,
        StoreName = schemaName,
        Record = serializedRecord
    };

    // Send JSON string directly to JavaScript (Blazor will not double serialize)
    return await CallJsAsync<TKey>(IndexedDbFunctions.ADD_ITEM, cancellationToken, [recordToSend]);
}

@magiccodingman
Owner Author

That's pseudo code obviously, but you're suggesting refactoring AddAsync to better support our serialization and respect it, so that we don't have to do all the complex runtime conversions? I think I get what you're saying. I don't know why it wasn't clicking for me before, but I see why this new serializer opens up this possibility.

@yueyinqiu
Collaborator

yueyinqiu commented Mar 6, 2025

Yes, I mean something like this.

And also, we don't need the extra ApplyPropertyMappings, since it could be included in our serializer too (so there is no conversion between the given object and an ExpandoObject/dictionary).

@magiccodingman
Owner Author

Yes, I see what you're saying now. Sorry about that, I misunderstood initially. So yes, I can for sure do that. It'd actually save an estimated 20% to 40% of memory usage, and it'd also be 3x-10x faster, from the quick calculations I did. So this would be quite a significant improvement in performance.

Let me try some things out real quick because it'll be fun and I need a break from my company stuff anyways. But I am going to make a new PR based on this one, because this doesn't affect just AddAsync. After review, it will affect nearly every single primary method utilized; this will have an effect on 15 separate methods.

So we want this to be a totally separate branch as this'll be a very large structural change. But it'll also be 100% worth it. I'll make a PR soon to start a new discussion and organize this code.
