ServiceProvider.GetServices() Returns Results Based on Last GetKeyedServices() Call #111795
Comments
Tagging subscribers to this area: @dotnet/area-extensions-dependencyinjection |
The race condition comes from the ThreadPool.UnsafeQueueUserWorkItem call in DynamicServiceProviderEngine.RealizeService: Lines 30 to 48 in c8acea2
The bug is that, when CallSiteFactory.TryCreateEnumerable creates an IEnumerableCallSite, it does not set ServiceCallSite.Key, which remains null: runtime/src/libraries/Microsoft.Extensions.DependencyInjection/src/ServiceLookup/CallSiteFactory.cs Line 357 in c8acea2
Lines 16 to 22 in c8acea2
runtime/src/libraries/Microsoft.Extensions.DependencyInjection/src/ServiceLookup/ServiceCallSite.cs Line 23 in c8acea2
ServiceCallSite.Key being null then causes ServiceProvider.ReplaceServiceAccessor to update the wrong item in _serviceAccessors, assigning the new keyed service accessor to an unkeyed service: runtime/src/libraries/Microsoft.Extensions.DependencyInjection/src/ServiceProvider.cs Lines 226 to 233 in c8acea2
In contrast, CallSiteFactory.TryCreateExact sets ServiceCallSite.Key before it stores the ServiceCallSite to the ConcurrentDictionary: runtime/src/libraries/Microsoft.Extensions.DependencyInjection/src/ServiceLookup/CallSiteFactory.cs Lines 402 to 404 in c8acea2
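To make the mechanism above concrete, here is a minimal model of it. The types `ServiceIdentifier`, `CallSite` and `AccessorCacheModel` are hypothetical, simplified stand-ins written for illustration; they are not the runtime's actual classes. The sketch only assumes that the accessor cache is indexed by a (service type, key) pair derived from the call site, as the comment above describes.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in: identifies a cache slot by service type and optional key.
public record ServiceIdentifier(Type ServiceType, object? ServiceKey);

// Hypothetical stand-in for a call site whose Key is assigned after construction.
public sealed class CallSite
{
    public CallSite(Type serviceType, string description)
    {
        ServiceType = serviceType;
        Description = description;
    }

    public Type ServiceType { get; }
    public string Description { get; }
    public object? Key { get; set; } // stays null if the factory forgets to set it
}

public static class AccessorCacheModel
{
    // Simulates the background replacement of a cached accessor.
    // setKey == false models a factory method that never assigns CallSite.Key.
    public static ConcurrentDictionary<ServiceIdentifier, string> Run(bool setKey)
    {
        var accessors = new ConcurrentDictionary<ServiceIdentifier, string>();
        accessors[new ServiceIdentifier(typeof(IDisposable), null)] = "unkeyed accessor";

        var keyedCallSite = new CallSite(typeof(IDisposable), "keyed accessor");
        if (setKey)
        {
            keyedCallSite.Key = "1";
        }

        // The slot to overwrite is computed from the call site itself,
        // so a null Key selects the unkeyed slot.
        var slot = new ServiceIdentifier(keyedCallSite.ServiceType, keyedCallSite.Key);
        accessors[slot] = keyedCallSite.Description;
        return accessors;
    }
}
```

In this model, `Run(false)` leaves the unkeyed slot holding the keyed accessor, while `Run(true)` keeps the two slots separate.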
CallSiteFactory.TryCreateOpenGeneric doesn't set ServiceCallSite.Key either. That is also a bug and can be tested as follows:
using System;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;
var services = new ServiceCollection();
services.AddTransient(typeof(IService<>), typeof(UnkeyedService<>));
services.AddKeyedTransient(typeof(IService<>), "1", typeof(KeyedService1<>));
var provider = services.BuildServiceProvider();
Console.WriteLine(provider.GetRequiredService<IService<int>>()); // Output: "UnkeyedService`1[System.Int32]"
Console.WriteLine(provider.GetRequiredKeyedService<IService<int>>("1")); // Output: "KeyedService1`1[System.Int32]"
Console.WriteLine(provider.GetRequiredKeyedService<IService<int>>("1")); // Output: "KeyedService1`1[System.Int32]"
await Task.Delay(100);
Console.WriteLine(provider.GetRequiredService<IService<int>>()); // Expected: "UnkeyedService`1[System.Int32]", Actual: "KeyedService1`1[System.Int32]"
interface IService<T>
{
}
class UnkeyedService<T> : IService<T>
{
}
class KeyedService1<T> : IService<T>
{
} |
So to fix this, I think the "dirty fix for ReplaceServiceAccessor" to set ServiceCallSite.Key must be implemented in CallSiteFactory.TryCreateEnumerable here: runtime/src/libraries/Microsoft.Extensions.DependencyInjection/src/ServiceLookup/CallSiteFactory.cs Line 357 in c8acea2
and in CallSiteFactory.TryCreateOpenGeneric here: runtime/src/libraries/Microsoft.Extensions.DependencyInjection/src/ServiceLookup/CallSiteFactory.cs Line 452 in c8acea2
Alternatively, it would be possible to change CallSiteFactory.CreateConstructorCallSite instead of CallSiteFactory.TryCreateOpenGeneric. That change would however be larger, as CallSiteFactory.CreateConstructorCallSite has three
For automated testing, I don't see a clean way to wait until the work item enqueued by DynamicServiceProviderEngine.RealizeService has finished executing. AFAICT, ThreadPool.PendingWorkItemCount is already decremented when the work item is dequeued, before it has started executing, so the test cannot just poll until ThreadPool.PendingWorkItemCount goes to zero. An increment of ThreadPool.CompletedWorkItemCount is a signal, but if the process is running other tests in parallel, then the increment could be caused by some unrelated work item. |
I presume a non-"dirty" fix would be to make ServiceCallSite.Key a readonly property and initialise it in constructors. The key would be a required parameter of the constructors (either as is or as part of a ServiceIdentifier parameter), and it would not be possible to forget to set it. However, if the fix needs to be backported to .NET 8.0, then I expect a smaller change would be more easily approved. |
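A minimal sketch of that idea follows, using hypothetical class names (`CallSiteSketch`, `EnumerableCallSiteSketch`) rather than the runtime's real types. It only illustrates the shape of the design: the key is a constructor parameter and a get-only property, so a derived call site cannot be created without stating its key.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical base class: Key is get-only and must be supplied at construction.
public abstract class CallSiteSketch
{
    protected CallSiteSketch(Type serviceType, object? key)
    {
        ServiceType = serviceType ?? throw new ArgumentNullException(nameof(serviceType));
        Key = key; // null means "unkeyed"; there is no setter to forget
    }

    public Type ServiceType { get; }
    public object? Key { get; }
}

// Hypothetical derived call site for IEnumerable<T> resolution.
public sealed class EnumerableCallSiteSketch : CallSiteSketch
{
    public EnumerableCallSiteSketch(Type itemType, object? key, int itemCount)
        : base(typeof(IEnumerable<>).MakeGenericType(itemType), key)
    {
        ItemCount = itemCount;
    }

    public int ItemCount { get; }
}
```

With this shape, every factory method has to pass a key (or an explicit null) when it constructs a call site, and the compiler rejects a constructor call that omits it.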
If you add the following to the start of the program, then the bug does not occur, but DI containers may become slower:
AppContext.SetSwitch("Microsoft.Extensions.DependencyInjection.DisableDynamicEngine", true);
IIUC, Native AOT compilation likewise disables the dynamic engine and thus avoids the bug: runtime/src/libraries/Microsoft.Extensions.DependencyInjection/src/ServiceProvider.cs Lines 249 to 261 in 5535e31
Line 9 in 5535e31
The existence of this workaround makes the fix less important to backport to .NET 8.0 and 9.0. |
Given that some of Microsoft's own extension packages use keyed services, it probably would be a good idea to disable the dynamic engine by default. The current state is basically a set-up to have everyone with a mildly complicated project sent through a debugging hell-hole before finally (hopefully...), by some miraculous event, arriving at this thread here, only to learn about the busted functionality in the dynamic engine and that the fix for them is basically "upgrade to .NET 10 or disable it; here's how." |
@KalleOlaviNiemitalo thanks for your analysis. I have not prototyped the suggestion:
but am flagging this as "Help Wanted" if someone wants to pick this up and ideally add tests that would verify. |
Upgrading only the Microsoft.Extensions.DependencyInjection package should suffice, if it remains compatible with the older .NET Runtime. On the Lines 101 to 102 in ca8698d
Line 4 in ca8698d
However, it could indeed be difficult for developers suffering from this bug to discover that there is a fix. |
Hi @steveharter, as mentioned in previous comments, the problem is that the The fix proposed by @KalleOlaviNiemitalo to make the If all that sounds good, I'd be happy to help and implement the fix as well as cover these cases with new tests. I see the issue has the |
Re automated testing of the fix:
Some tests use RemoteExecutor to run code in a separate process. If the test did that, I think it could then rely on polling ThreadPool.CompletedWorkItemCount. Or perhaps it could limit the thread pool to one worker thread only (ThreadPool.SetMaxThreads), post another work item after DynamicServiceProviderEngine.RealizeService has posted its, and wait for the second work item to finish. I don't know how reliably the thread pool stays within the specified max thread limit, though. |
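The polling idea could look roughly like the sketch below. `ThreadPoolWaiter` and its method are hypothetical helper names, not part of the runtime's test infrastructure. The approach assumes the test runs in an otherwise idle process (for example under RemoteExecutor), because any unrelated work item would also advance the counter.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical test helper: waits until the thread pool reports that at least
// `expected` more work items have completed than at the recorded baseline.
public static class ThreadPoolWaiter
{
    public static bool WaitForCompletedWorkItems(long baseline, long expected, TimeSpan timeout)
    {
        var stopwatch = Stopwatch.StartNew();
        while (ThreadPool.CompletedWorkItemCount - baseline < expected)
        {
            if (stopwatch.Elapsed > timeout)
            {
                return false; // gave up; the caller decides whether to fail the test
            }

            Thread.Sleep(1);
        }

        return true;
    }
}
```

A test would read `ThreadPool.CompletedWorkItemCount` before triggering the second keyed resolution, then call `WaitForCompletedWorkItems(baseline, 1, timeout)` before asserting on the unkeyed result.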
Done. Thanks. |
@steveharter, while it's relatively easy to fix the problem, covering these cases with unit tests is tricky since the DI cache update happens inside the DynamicServiceProviderEngine method in a background task. One way to do this is to simply use Task.Delay, but this might not be completely reliable. I've created tests verifying this behavior for both the Generic and Enumerable cases and ran them with different delays (with the current implementation, i.e. without applying the fix) to see if we could get an acceptable reproduction rate. It seems that 10000 ms is pretty reliable (though I assume the environment can affect the results):
Please let me know if this is acceptable, and we can use the 10000 ms delay to wait for the cache to be updated. If this is still not reliable enough, I can see several ways to do it reliably, but so far I haven't found a reliable way to do it without any additional code changes (e.g. by using reflection in tests), so in this case we need to make changes to the DynamicServiceProviderEngine class just for tests (e.g. via a new internal event or by converting background update to an internal
Could you please share your opinion on which way seems preferable to you? |
I suggested in the PR to make these tests Outerloop. Can you verify they still repro when running as outerloop? Sample commands to run those:
|
Also I have an open PR that also fixes some caching issues: #113137 |
Thank you, done.
Yes, they reproduce when running outerloop tests as you suggested. |
This fix is not included in the Microsoft.Extensions.DependencyInjection 10.0.0-preview.2.25163.2 package, but it is on the |
So, I take it this won't be backported and fixed for those users that remain on the 8.0.x series of packages and their closed scope of features, which originally went with .NET 8? |
@rjgotten, I don't know whether the fix will be backported to 9.0 and/or 8.0. The fix looks pretty straightforward and has tests, so a backport would have only a low risk of causing new bugs. The public API did not change. I have been commenting here as a user of Microsoft.Extensions.DependencyInjection who has interest in strange bugs. This particular bug has not affected any software developed by me, so I'm not going to ask the maintainers of .NET to backport the fix. If this bug is affecting your application and you want a backport, I guess the maintainers may be interested in why you cannot use the DisableDynamicEngine workaround. |
Oh, I could use that workaround in potential affected apps. It's just a matter of knowing whether I should expect to need to keep it disabled - because as I understood it, having it disabled involves a few performance downsides as well. That's all. |
Description
The ServiceProvider.GetServices<T>() method in Microsoft.Extensions.DependencyInjection exhibits inconsistent behaviour when used in conjunction with .GetKeyedServices(key). Specifically, after calling .GetKeyedServices(key) for a particular key more than once and awaiting a slight delay, subsequent calls to GetServices() return the same result as the last GetKeyedServices(key) call, instead of only returning non-keyed services as expected. You can also no longer get the IEnumerable of services of the specified type that have been registered without a key, since the method keeps returning the same result as the last GetKeyedServices(key) call. The behaviour only occurs if .GetKeyedServices<T>(key) returns 1 or more results; it does not occur for keys that have no dependencies of the requested type <T> registered.
Reproduction Steps
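A minimal sketch consistent with the description might look like the following. It is a reconstruction under assumptions, not the reporter's exact code: the key `"a"`, the `Repro` class, the `Print` helper and the shape of `MyClass` beyond its `Key` property are all illustrative choices, and the sketch needs the Microsoft.Extensions.DependencyInjection package (8.0 or later for keyed services).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;

public class MyClass
{
    public string Key { get; set; } = "";
}

public static class Repro
{
    public static async Task Main()
    {
        var services = new ServiceCollection();
        // Unkeyed registration, as quoted in the report.
        services.AddTransient<MyClass>(ctx => new MyClass { Key = "null" });
        // Keyed registration; the key "a" is an assumption for illustration.
        services.AddKeyedTransient<MyClass>("a", (ctx, key) => new MyClass { Key = "a" });

        using var provider = services.BuildServiceProvider();

        Print("unkeyed, before", provider.GetServices<MyClass>());
        Print("keyed, 1st call", provider.GetKeyedServices<MyClass>("a"));
        Print("keyed, 2nd call", provider.GetKeyedServices<MyClass>("a"));

        // Give the background work item time to replace the cached accessor.
        await Task.Delay(100);

        // Per the report, affected versions return the keyed result here.
        Print("unkeyed, after", provider.GetServices<MyClass>());
    }

    private static void Print(string label, IEnumerable<MyClass> items) =>
        Console.WriteLine($"{label}: {string.Join(", ", items.Select(i => i.Key))}");
}
```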
Expected behavior
provider.GetServices<MyClass>() should only return an enumerable containing the implementation defined as services.AddTransient<MyClass>( ctx => new MyClass { Key = "null" });
Actual behavior
provider.GetServices<MyClass>() returns the result of the last provider.GetKeyedServices<MyClass>(key) call.
Regression?
I tested it only on net8.0 but tried all library versions supporting keyed services.
Known Workarounds
No known workarounds.
Configuration
.NET Version: .NET 8.0
Tested OSs: macOS Sonoma 14.4 (with Intel chip), Ubuntu
Architecture: x64
Is it specific to that configuration: No, I presume it will work (or not work, depending on how you look at it) on any configuration.
Other information
If you call GetKeyedServices(key) at least twice, it will replace the result of all subsequent GetServices() calls with its own result. The replacement doesn't take effect instantly after the 2nd GetKeyedServices(key) call though, but a few milliseconds after it. When I tested it with versions 8.0.0, 8.0.1 and 9.0.0, a 10 millisecond delay was more than enough. With 9.0.1 it sometimes takes more than 10 ms for the effect to take place, which is why the example uses 100 ms.
You don't necessarily need an asynchronous operation for the bug to occur. You can use synchronous calls to reproduce the bug, for example:
This synchronous code will at some point exit the while loop, but the number of iterations it takes is inconsistent (I got as low as 18,000 and as high as 49,000), so it's easier to just use await Task.Delay(10) instead of a loop, and still get the same behaviour.