Age | Commit message | Author |
|
Some discussion followed the "Fix synchronization problem in
ByteArray" patch, and that led us to believe that it would be better
to change the JavaConnection class so that it does not rely on being
able to pass ByteArrays around in a way that makes them vulnerable to
the problems that had been seen.
The JavaConnection class accepts a ByteArray in its "write()" method,
and emits a ByteArray when it has read data.
ByteArrays are not the ideal way for the JavaConnection class to
manipulate data and so this patch changes the implementation so that:
a) the "write()" method extracts the byte[] from the supplied
ByteArray and uses these objects, rather than keeping
references to the ByteArray objects (which might lead to
synchronisation issues).
b) the "doRead()" method uses a ByteArrayOutputStream to hold incoming
data, and only constructs a ByteArray out of it when it is ready to
return the data to the application.
These changes make the class more efficient, since in the case of (a),
the need to create temporary ByteArrays is removed, and in (b) the
code no longer creates ByteArrays by iterating through the network
data one byte at a time and appending it to a ByteArray.
It also means that the "synchronized" patch (which would fix the
problem) is no longer necessary, and so that code is reverted.
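The two changes can be sketched roughly as follows. The class and method shapes here are hypothetical stand-ins modelled on the description above, not the actual Stroke sources:

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical minimal stand-in for ByteArray, used only to illustrate the idea.
class ByteArrayStandIn {
    private final byte[] data;
    ByteArrayStandIn(byte[] data) { this.data = data; }
    byte[] getData() { return data; }
}

class ConnectionSketch {
    private final Queue<byte[]> pending = new ArrayDeque<byte[]>();

    // (a) extract the byte[] immediately and keep no reference to the
    // (possibly shared) ByteArray, avoiding the synchronisation issue.
    void write(ByteArrayStandIn byteArray) {
        byte[] contents = byteArray.getData();
        byte[] copy = new byte[contents.length];
        System.arraycopy(contents, 0, copy, 0, contents.length);
        pending.add(copy);
    }

    // (b) accumulate incoming chunks in a ByteArrayOutputStream and build
    // the result in one step, instead of appending byte-by-byte.
    ByteArrayStandIn doRead(byte[][] networkChunks) {
        ByteArrayOutputStream incoming = new ByteArrayOutputStream();
        for (byte[] chunk : networkChunks) {
            incoming.write(chunk, 0, chunk.length);
        }
        return new ByteArrayStandIn(incoming.toByteArray());
    }

    int pendingCount() { return pending.size(); }
}
```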
Test-information:
I patched the code to emulate the situation that would occur when a
buffer is only partially written, and verified that in this case it
correctly re-inserted the unwritten portion of the buffer at the
front of the pending queue.
Ran MLC to various servers, all seems to work OK.
Tested in Harrier, seems to work OK, and does not exhibit problems
that we had seen previously which led us to investigate this issue.
Change-Id: Ifcda547402430c87da45ba7d692518b5af285763
Signed-off-by: Nick Hudson <nick.hudson@isode.com>
|
|
We noticed that in certain circumstances a stream of data being sent
to a server was being corrupted.
According to the "onDataWritten" signal, we could see that the data
which Stroke thought it was writing was valid, but by adding debug
code to the JavaConnection class, we could see that what was actually
being sent over the socket was wrong. For example, where
"onDataWritten" would report something like
some text for the server
the actual data being written to the socket (as shown by
toString() of the bytestream) would be something like:
some text fo\200\300\200\300\200\300\200\300\200\300\200\300\200\300\200\300\200\300\200\300\200\300\200\300
i.e. the length of data is correct, but the last part of the buffer is broken.
We saw this on non-TLS connections, but never on TLS connections.
The reason for this (verified after some debugging) is that the
"ByteData.getData()" method was unsynchronized. In the failing cases,
two threads are calling this method at once. The first one finds that
"dataCopy_" is null, and so allocates it and starts filling it with data.
The second thread calls "getData()" before this completes, which means
it sees "dataCopy_" as non-null, and uses that value (even though the
first thread hasn't finished populating it yet).
In the failing scenario, the two threads involved were (1) thread that
was handling the "onDataWritten()" callback (which called "getData()"
to get a String that it sent to a debug stream) and (2) the
JavaConnection code (which wants to write the data to the socket).
It seems likely that the reason this doesn't happen for TLS
connections is that in that case, the JavaConnection object will be
processing a ByteArray object that has been generated via the
SSLEngine (rather than the one which "onDataWritten()" sees), and so
the chance of two threads both calling "getData()" is reduced.
(I have not followed the TLS code path thoroughly to verify this).
So this change makes any method in ByteArray that touches "dataCopy_"
be synchronized (as well as hashCode(), as suggested by FindBugs).
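The race and the fix can be sketched in miniature like this (a stand-in class, assuming the lazily-initialised "dataCopy_" field described above; not the actual ByteArray source):

```java
// Without "synchronized", a second thread can observe a non-null dataCopy_
// that the first thread is still in the middle of populating.
class ByteArraySketch {
    private final StringBuilder data = new StringBuilder("payload");
    private byte[] dataCopy_;

    // The fix: serialise access to dataCopy_ so the lazy initialisation
    // either completes fully or is not visible at all.
    synchronized byte[] getData() {
        if (dataCopy_ == null) {
            dataCopy_ = data.toString().getBytes();
        }
        return dataCopy_;
    }
}
```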
Test-information:
Having inserted some debug code, I could reproduce the "data
corruption" problem reliably.
After adding the "synchronized" directive to "getData()", I could no
longer reproduce the corruption.
Ran MLC with this patch and works with no problems
Change-Id: I02008736a2a8bd44f3702c4526fd67369a3c136a
Signed-off-by: Nick Hudson <nick.hudson@isode.com>
|
|
If a TLS connection results in the server choosing an anonymous cipher
suite, then no server certificate will be returned by the server.
This ought not to happen, since XMPP clients are expected only to
propose non-anonymous cipher suites, but it could be that a client is
coded to propose anonymous suites, or that a bug in the server means
that it fails to return a server certificate.
This change updates the ServerIdentityVerifier to make it resilient
against these situations, treating this situation as equivalent to
"certificate presented by server does not verify".
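The guard can be sketched as follows; the method and class names are illustrative, not the actual ServerIdentityVerifier API:

```java
import java.security.cert.Certificate;
import java.util.List;

// Sketch: treat a missing certificate chain (anonymous cipher suite chosen
// by the server) the same as a certificate that fails verification.
class ServerIdentitySketch {
    static boolean verify(List<Certificate> chain) {
        if (chain == null || chain.isEmpty()) {
            return false; // anonymous suite: nothing to verify, so fail
        }
        return verifyChain(chain); // normal verification path
    }

    private static boolean verifyChain(List<Certificate> chain) {
        return true; // placeholder for the real verification logic
    }
}
```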
Test-information:
In my testing, I was deliberately using anonymous ciphers and getting
Stroke crashes. After this patch, I don't get Stroke crashes any more
(but the connection fails because the certificate verification fails).
Change-Id: Ia7b9b8dad7a054ff266a78ef33a56157320654c8
|
|
In the PlatformDomainNameResolver class there is a DomainNameAddressQuery
class (accessible via DomainNameResolver->createAddressQuery()) for
performing a DNS lookup on a given domain name. This should have been
returning the set of all HostAddresses associated with a given domain, but
instead was only returning a singleton set (or an empty set if there was
no DNS entry).
This patch fixes this by changing the method call from
InetAddress.getByName() to InetAddress.getAllByName().
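The difference between the two calls, shown with a small helper (the wrapper method is illustrative; the real code lives inside the query class):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// getByName() returns only the first address for a name;
// getAllByName() returns every address, which is what a resolver needs.
class ResolverSketch {
    static InetAddress[] resolveAll(String domain) {
        try {
            return InetAddress.getAllByName(domain);
        } catch (UnknownHostException e) {
            return new InetAddress[0]; // "empty set if there was no DNS entry"
        }
    }
}
```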
Test-information:
Tested on top of my MLC Diagnose SRV patch. For 'google.com' we now see a
full list of IP addresses associated with it, rather than just the one.
Change-Id: I6e57c16bb64f76048f16bcff8ee9c1924049a051
|
|
This change moves responsibility for creating the TLSContextFactory
from CoreClient into NetworkFactories, which is in line with the
Swiften implementation.
This means that a caller may now provide his own concrete
TLSContextFactory using code of the form:
    NetworkFactories myNetworkFactories;
    // ...
    myNetworkFactories = new JavaNetworkFactories(eventLoop()) {
        @Override
        public TLSContextFactory getTLSContextFactory() {
            return new MyTLSContextFactory();
        }
    };
Test-information:
I implemented separate TLSContextFactory and TLSContext classes that
used OpenSSL (via JNI) to provide SSL functionality. I was able to
switch to using these with the mechanism that this patch provides.
I also verified that existing code which doesn't try to provide its
own NetworkFactories subclass still works as before (i.e. this patch
doesn't break existing applications).
Change-Id: Ibf07ddbbb4a4d39e4bb30a28be9aa0c43afe005f
Signed-off-by: Nick Hudson <nick.hudson@isode.com>
|
|
It was noticed that in certain cases, Stroke got stuck when connecting
to a server over TLS, and the server closed down the connection.
Investigation showed that this appears to be caused by the JSSEContext
code not properly coping with a "close notify" SSL message. What
happens in this case is that the SSLEngine generates a response to the
server's "close notify", and expects this to be sent back over the
network.
The original JSSEContext code saw the "CLOSED" status from the
SSLEngine.unwrap(), but assumed that no more data would be generated
by the engine (which was wrong, because the engine wants to send a close
response back), and so got stuck in a loop.
This patch therefore fixes the JSSEContext code to deal properly when
it sees a CLOSED from unwrap.
After the close has been received, an error will be emitted by
JSSEContext so that the application knows that the SSLEngine can no
longer be used (in practice we have always seen the socket closing,
which generates its own error to the application, but it was
recommended that we should have this check in case a server sends a
close notify and does NOT close the socket as well).
It appears that many servers don't actually send the "close notify",
and just drop the connection, which is (presumably) why we'd not seen
this behaviour before.
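The loop-control decision can be sketched as a small pure helper (the method name and shape are illustrative, not the actual JSSEContext code):

```java
import javax.net.ssl.SSLEngineResult;

// On CLOSED from unwrap() the engine may still have a "close notify"
// response that needs to be wrap()ed and sent, so the pump must keep
// draining the engine rather than spinning or stopping immediately.
class CloseNotifySketch {
    /** Should the read/write pump keep running after this unwrap result? */
    static boolean continueAfterUnwrap(SSLEngineResult.Status status,
                                       int bytesProducedByWrap) {
        if (status == SSLEngineResult.Status.CLOSED) {
            // Keep going only while wrap() still produces the close
            // response; afterwards, emit an error to the application.
            return bytesProducedByWrap > 0;
        }
        return status == SSLEngineResult.Status.OK;
    }
}
```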
Test-information:
Tested by connecting to the aforementioned server. This time, when
the connection times out (and the close notify is sent), we no longer
see a loop; the application realises what has happened and attempts
to reconnect.
I have been running with this patch in my copy of MLC for two weeks
and have noticed no difference in behaviour - so far as I can tell the
code is not exercised when talking to M-Link but at any rate the patch
isn't causing anything to break.
Change-Id: Id007c923c510ef1b4ce53192105b00296c65c757
|
|
It is possible for the selector to be null if the SocketChannel open
failed, so this patch adds a null check.
Test-information:
sanity tested on linux by connecting/reconnecting on an xmpp service on linux
Change-Id: Idee180ca4aefd1f743705da674b486dd8acc4922
Reviewer: Nick Hudson <nick.hudson@isode.com>
Reviewer: Kevin Smith <kevin.smith@isode.com>
|
|
I left MLC (an XMPP client) running overnight and noticed a "Too many open
files" error when trying to stop/start the XMPP server. On doing an
"lsof | grep java", I noticed a large number of open sockets, which was
presumably the cause of this error.
After this patch, the lsof command shows a constant number of open sockets.
Test-information:
tested on a CentOS VM by running "lsof | grep java | wc -l" - open sockets do not increase.
Change-Id: I7ddff78a1efb005177427fda21f1d0b92d8ed7cc
Reviewer: Kevin Smith <kevin.smith@isode.com>
|
|
When investigating problems on Solaris, attention focused on the
JavaConnection class, whose implementation appeared to be non-optimal.
The original implementation had a loop which operated on a
non-blocking socket, and looked something like this:
    while (!disconnecting) {
        while (something to write) {
            write data to socket;
            if write failed {
                sleep(100); // and try again
            }
        }
        try reading data from socket
        if (any data was read) {
            process data from socket;
        }
        sleep(100);
    }
Because the socket is non-blocking, the reads/writes return straight
away. This means that even when no data is being transferred, the
loop is executing around ten times a second checking for any data to
read/write.
In one case (Solaris client talking to Solaris server on the same VM)
we were consistently able to get into a state where a write fails to
write any data, so that the "something to write" subloop never exits.
This in turn means that the "try reading data" section of the main
loop is never reached.
Investigation failed to uncover why this problem occurs. The
underlying socket appears to be returning EAGAIN (equivalent to
EWOULDBLOCK), suggesting that the write fails because the client's
local buffer is full. This in turn implies that the server isn't
reading data quickly enough, leading to the buffers on the client side
being full up. But this doesn't explain why, once things have got
into this state, they never free up.
At any rate, it was felt that the implementation above is not ideal
because it is relying on a polling mechanism that is not efficient,
rather than being event driven.
So this change re-implements JavaConnection to use a Selector, which
means that the main loop is event-driven. The new implementation
looks like this
    while (!disconnected) {
        wait for selector
        if (disconnected) {
            break;
        }
        if something to write {
            try to write data;
        }
        if something to read {
            try to read data;
        }
        if still something to write {
            sleep(100);
            post wake event; // so that next wait completes straight away
        }
    }
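The event-driven core of this pattern can be demonstrated without a network, using a Pipe as the channel; this is a self-contained illustration of Selector usage (including the "post wake event" trick), not the actual JavaConnection code:

```java
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// A Selector blocks until the registered channel becomes readable (or
// until wakeup() posts a wake event), instead of polling on a timer.
class SelectorDemo {
    static int readyAfterWrite() throws Exception {
        Selector selector = Selector.open();
        Pipe pipe = Pipe.open();
        pipe.source().configureBlocking(false);
        pipe.source().register(selector, SelectionKey.OP_READ);

        // Writing data makes the source channel readable, so select()
        // returns as soon as the data arrives rather than after a sleep.
        pipe.sink().write(ByteBuffer.wrap(new byte[] { 42 }));
        int ready = selector.select(1000);

        // The "post wake event" trick: wakeup() makes the next select()
        // return straight away even when nothing is readable.
        selector.wakeup();
        selector.selectNow();

        pipe.sink().close();
        pipe.source().close();
        selector.close();
        return ready;
    }
}
```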
Test-information:
Testing appears to show that the problems we saw on Solaris are no
longer seen with this patch (Solaris tests still fail, but later on,
which appears to be due to a separate problem).
Testing shows that this leads to the thread spending much more time
idle, and only being active when data is being read/written (unlike
the original implementation which was looping ten times a second
regardless of whether any data was being read/written).
Testing using MLC seems to show the new implementation works OK.
I was unable to provoke the "write buffer not completely written"
case, so faked it by making the doWrite() method constrain its maximum
write size to 200 bytes. By doing this I verified that the "leftOver"
section of code was working properly (and incidentally fixed a problem
with the initial implementation of the patch that had been passing
the wrong parameter to System.arraycopy).
Change-Id: I5a6191567ba7e9afdb9a26febf00eae72b00f6eb
Signed-off-by: Nick Hudson <nick.hudson@isode.com>
|
|
Making it a Long allows it to hold an XML unsignedLong value as well
as null. Before this patch, it was an int that defaulted to 0.
This was not right: int is too small to hold the number of seconds of
last activity time, and primitive data types do not allow for null values.
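A short illustration of why the boxed Long is needed here (the helper is illustrative, not the actual parser):

```java
// Long can represent both the absence of a value (null) and second counts
// larger than Integer.MAX_VALUE, which an int defaulting to 0 cannot.
class LastActivitySketch {
    static Long parseSeconds(String value) {
        return (value == null) ? null : Long.valueOf(value);
    }
}
```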
Test-information:
tested using an XMPP client to query last IQ on MUC rooms
Change-Id: I6274403610bd60038fd7c235fad3bc2798f38e19
Reviewer: Kevin Smith <kevin.smith@isode.com>
|
|
Some implementations of SSLEngine (notably Apache Harmony, used in
Android) never return the FINISHED status from calls to wrap or unwrap,
causing the TLSLayer never to emit its completed signal.
With this change, we treat a return of NOT_HANDSHAKING as equivalent
to FINISHED. The NOT_HANDSHAKING will never happen before handshaking
has finished, because the status during handshaking should always be
NEED_WRAP, NEED_UNWRAP, or NEED_TASK.
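The check reduces to something like this (method and class names are illustrative):

```java
import javax.net.ssl.SSLEngineResult.HandshakeStatus;

// Some SSLEngine implementations never report FINISHED, so after a
// wrap/unwrap NOT_HANDSHAKING is treated as handshake completion too:
// during the handshake the status is always NEED_WRAP, NEED_UNWRAP or
// NEED_TASK, so NOT_HANDSHAKING cannot occur before it has finished.
class HandshakeSketch {
    static boolean handshakeComplete(HandshakeStatus status) {
        return status == HandshakeStatus.FINISHED
            || status == HandshakeStatus.NOT_HANDSHAKING;
    }
}
```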
Test-information:
Tested with OracleJDK and OpenJDK using Isode M-Link Console to ensure
that the behaviour when negotiating TLS is unchanged (debugging shows
that in these cases it always sees the FINISHED status).
Tested on Android. Without this patch TLS handshakes don't complete;
with the patch, they do.
Change-Id: Ied2989cb2a3458dc6b1d2584dcc6c722d18e1355
Signed-off-by: Nick Hudson <nick.hudson@isode.com>
|
|
Direct copy of current signal/slot implementation,
with 4 generic parameters.
Change-Id: I4b2cb37fd134e80e8481950030b6e8721f4f2854
|
|
By default, when a TLS connection is established, the SSLContext will
enable all available ciphersuites. This may not be appropriate in
situations where export restrictions apply and higher-grade
ciphersuites are prohibited.
This change allows a caller to configure a restricted set of
ciphersuites to be used when establishing TLS connections.
Callers use the JSSEContextFactory.setRestrictedCipherSuites() method
to configure a list of ciphersuites. Any ciphersuites which are not
included in the list will be excluded in subsequent TLS connections.
If JSSEContextFactory.setRestrictedCipherSuites() is never called,
or is called with a null parameter, then no restriction applies.
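The filtering step can be sketched as an intersection of the suites the SSLContext enables by default with the caller-supplied list (names modelled on the message; not the actual source):

```java
import java.util.ArrayList;
import java.util.List;

// Keep only the enabled suites that appear in the caller's restricted
// list; a null list means "no restriction applies".
class CipherRestrictionSketch {
    static String[] restrict(String[] enabled, List<String> allowed) {
        if (allowed == null) {
            return enabled; // setRestrictedCipherSuites() never called / null
        }
        List<String> result = new ArrayList<String>();
        for (String suite : enabled) {
            if (allowed.contains(suite)) {
                result.add(suite);
            }
        }
        return result.toArray(new String[0]);
    }
}
```

In the real code the result would be passed to something like SSLEngine.setEnabledCipherSuites() before the handshake starts.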
Test-information:
Validated that by calling the new method to restrict the available
ciphers, TLS connections initiated by Stroke only propose ciphersuites
in the restricted list, and connections fail when the server fails to
find an acceptable cipher.
Change-Id: Id0b4b19553a6f386cda27a71f0172410d899218e
Signed-off-by: Nick Hudson <nick.hudson@isode.com>
|
|
This patch adds a new "CAPICertificate" class, which can be used to
configure TLS connections that use a client certificate from a Windows
CAPI keystore, including certificates on smart cards.
The JSSEContext class is updated so that "setClientCertificate()"
checks to see whether the CertificateWithKey object that it's been
given is a PKCS12Certificate or a CAPICertificate, and initializes the
appropriate type of KeyStore.
Note that the default behaviour of the KeyStore returned by SunMSCAPI
when choosing a client certificate for TLS authentication is for it to
choose the "most suitable" certificate it finds.
This "most suitable" certificate may not be the one that the user has
chosen, and in fact various certificates in CAPI are not considered by
SunMSCAPI in this case - for example, certificates issued by CAs who
don't appear in the list of acceptable CAs in the server's
CertificateRequest (RFC5246 7.4.4).
The CAPIKeyManager class provided here allows a caller to override the
default behaviour, and force the use of a specific client certificate
(whether it's "suitable" or not) based on the value specified by the
caller when the CAPICertificate object was created.
This also means that it is possible for a user to specify a particular
certificate and use that, even if SunMSCAPI would have thought a "more
suitable" one was found in CAPI.
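The "force a specific certificate" idea corresponds to a KeyManager that always returns the caller's alias instead of letting SunMSCAPI pick the "most suitable" one. This is a hedged sketch: the class name matches the message, but the alias wiring and the stubbed-out methods are illustrative, not the actual CAPIKeyManager source:

```java
import java.net.Socket;
import java.security.Principal;
import java.security.PrivateKey;
import java.security.cert.X509Certificate;
import javax.net.ssl.X509ExtendedKeyManager;

// Always choose the caller-specified alias for client authentication,
// ignoring the default "most suitable" selection and issuer filtering.
class ForcedAliasKeyManager extends X509ExtendedKeyManager {
    private final String forcedAlias;

    ForcedAliasKeyManager(String forcedAlias) {
        this.forcedAlias = forcedAlias;
    }

    @Override
    public String chooseClientAlias(String[] keyType, Principal[] issuers,
                                    Socket socket) {
        return forcedAlias; // force the user's choice, suitable or not
    }

    @Override public String[] getClientAliases(String keyType, Principal[] issuers) {
        return new String[] { forcedAlias };
    }
    @Override public String chooseServerAlias(String keyType, Principal[] issuers, Socket socket) {
        return null; // client-side only
    }
    @Override public String[] getServerAliases(String keyType, Principal[] issuers) {
        return null;
    }
    @Override public X509Certificate[] getCertificateChain(String alias) {
        return null; // would be fetched from the CAPI KeyStore
    }
    @Override public PrivateKey getPrivateKey(String alias) {
        return null; // likewise
    }
}
```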
Test-information:
Tested that P12 based TLS still works
Tested on Windows that I can specify a "CAPICertificate" which is a
reference to a certificate in the Windows keystore whose private key
is held on a smartcard, and that I am prompted to insert the card (if
necessary) and enter the PIN before the TLS handshake proceeds.
Tested on Windows that I can specify a "CAPICertificate" which is a
reference to an imported P12 file where certificate and key are in
CAPI, and the TLS handshake proceeds without asking me for a PIN
Tested that the "CAPIKeyManager" class is correctly forcing use of the
certificate specified by the user, rather than the one which would be
returned by the default SunMSCAPI implementation.
Tested that I can still use "PKCS12Certificate"s to authenticate
Tested that if I try and use a CAPICertificate on a non-Windows
platform, then I can't authenticate, and get errors emitted from Stroke
complaining of "no such provider: SunMSCAPI"
Change-Id: Iff38e459f60c0806755820f6989c516be37cbf08
Signed-off-by: Nick Hudson <nick.hudson@isode.com>
|
|
Two things:
- the implementation of JavaTrustManager was attempting to instantiate a
TrustManagerFactory with a hard-coded name of "PKIX", which doesn't
work on Android. So instead of that, we ask for the
TrustManagerFactory's default algorithm - which for the standard JRE
still appears to be "PKIX", but which for Android may be something
else.
- the "hack" which had been in place to force the SSLEngine to
perform a TLS handshake has been removed.
Calling "SSLEngine.beginHandshake()" is not guaranteed to make the
SSLEngine perform the TLS handshake, which it typically only does when
it is told to wrap some data from the client. The earlier version of
JSSEContext provoked this by asking it to send a "<" character, and
then removing the leading "<" from whatever Stroke happened to send next.
It turns out that you can force the handshake to start by telling the
SSLEngine to wrap 0 bytes of data from the client, and so this change
removes the hack, and instead calls "wrapAndSendData()" with an empty
buffer as soon as the SSLEngine has been created.
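The empty-wrap trick can be demonstrated standalone: after beginHandshake(), wrapping a zero-byte buffer makes the SSLEngine emit its first handshake message (the ClientHello) with no application data involved. This demo uses the default SSLContext, not Stroke's wrapAndSendData():

```java
import java.nio.ByteBuffer;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLEngineResult;

class HandshakeStartDemo {
    static int startHandshake() throws Exception {
        SSLEngine engine = SSLContext.getDefault().createSSLEngine();
        engine.setUseClientMode(true);
        engine.beginHandshake(); // does not itself produce any output

        // Wrapping 0 bytes of application data forces the engine to
        // generate the ClientHello into the network buffer.
        ByteBuffer empty = ByteBuffer.allocate(0);
        ByteBuffer net = ByteBuffer.allocate(engine.getSession().getPacketBufferSize());
        SSLEngineResult result = engine.wrap(empty, net);
        return result.bytesProduced(); // > 0: handshake has started
    }
}
```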
Test-information:
Ran XMPP client that uses TLS and verified that everything still works
as expected.
Change-Id: Ie08d76bd2f5a743320a59bad62a09c1f215c48d6
Signed-off-by: Nick Hudson <nick.hudson@isode.com>
|
|
If since_ is null, calling clone() on it was causing a NullPointerException.
Adding a null check fixes it.
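The fix amounts to guarding the clone; the field name is taken from the message, the surrounding class is illustrative:

```java
import java.util.Date;

// Return a defensive copy of since_, or null if it was never set,
// instead of calling clone() on a null reference.
class SinceSketch {
    private final Date since_;

    SinceSketch(Date since) { this.since_ = since; }

    Date getSince() {
        return since_ == null ? null : (Date) since_.clone();
    }
}
```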
Test-information:
Tested by creating a room using an XMPP client - no exception seen after the fix
Change-Id: I25b151ac8e5b25562b8941eb5532fa9b9ea2de6f
|
|
Change-Id: I49cf4cba01452b291655dfccdc134180270c1ff3
|
|
Change-Id: I862e11dc293ce84e0311f1ad470293e07735aeaf
|
|
Change-Id: Ib02394df2c7bb818c2409b1d6f2fc3ad0d938224
|
|
Change-Id: Id2710c674abc19cdf2b37f97fe53288b86c7f367
|
|
Change-Id: Iab58df1cf6a3b8b9461b71fd3f27476214e07286
|
|
Change-Id: Ie2ec5f94e0a1ee381ab43c09465571de94e64b6f
|
|
Change-Id: I373469fa7a7ba8d5c639d4a1f2d4e07182eeb953
|
|
Change-Id: Ib4717891c591911e68a5b27b7af4e666b6296d48
|
|
Change-Id: Ic7adcf9790429c23b9493ec22324198bfc474b6f
|
|
Change-Id: I0e333781b140a97788e35d401e054a413af0ab76
|
|
Change-Id: Ia1460c62f0bce645248b2412a60a6ad7420ae849
|
|
Change-Id: Iba3aeab8b0140c32f732ce01b1e2da243e7ec141
|
|
Change-Id: If5ef43f2d875f958cd8114b0b3246e6e6f03c95b
|
|
Change-Id: Ic7f627d38318c352c7db057c2347d5e617f4078c
|
|
|
|
|
|
|
|
|
|
Makes ClientOptions do more.
|
|
|
|
|
|
|
|
The method was comparing the classes instead of the instances, as a
result of which equal JIDs belonging to different derived classes were
always unequal.
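The change boils down to using instanceof rather than a class comparison in equals(); this minimal stand-in illustrates the behaviour, it is not the actual JID source:

```java
// With instanceof, a subclass instance with the same content compares
// equal to a plain instance; comparing getClass() values never would.
class JidSketch {
    private final String jid;

    JidSketch(String jid) { this.jid = jid; }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof JidSketch)) { // was: getClass() comparison
            return false;
        }
        return jid.equals(((JidSketch) other).jid);
    }

    @Override
    public int hashCode() { return jid.hashCode(); }
}

// A derived class that adds no state, like the client's JID subclass.
class DerivedJidSketch extends JidSketch {
    DerivedJidSketch(String jid) { super(jid); }
}
```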
Test-information:
Tested using an XMPP client which is using objects of class derived from JID
class to compare with raw JID objects
|
|
|
|
In order to make it available to clients.
Test-information:
tested using an XMPP Admin tool to display connection type error
|
|
In one of my testing scenarios, the socket was not getting closed. This
was happening when an XMPP client was connecting to a domain which was
different from the domain in the Jabber ID of the connecting user.
Moving the close call to a finally block ensures that the socket gets
closed in all scenarios.
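The pattern is simply try/finally around the step that can fail; the interface and method names here are illustrative, not the actual connection code:

```java
// A tiny stand-in for the socket, so the sketch needs no checked exceptions.
interface Sock { void close(); }

class CloseInFinallySketch {
    // close() runs on every path, including when the connect step throws
    // (e.g. the domain-mismatch case described above).
    static void connectAndClose(Sock socket, Runnable connectStep) {
        try {
            connectStep.run();
        } finally {
            socket.close();
        }
    }
}
```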
Test-information:
I created an IM domain j.com on my XMPP server and then added a user with
that domain (user@j.com). Then I tried connecting to my primary domain
using this new user. After removing j.com, I could see an increase in the
number of sockets after every poll (coreclient.connect()), but not after
this patch.
|
|
If I left an application using Stroke running for a few hours (making
periodic connection attempts), the JVM would throw an exception saying
"Too Many Open Files". On doing an "lsof -p <pid_of_jvm>", I noticed that
there were a number of open sockets in CLOSE_WAIT state, and these went
up after every attempt to do a connect on the CoreClient object.
Closing of DirContext object fixes this bug and the number of open sockets
does not increase.
Test-information:
Ran MLC and kept on monitoring the result of "lsof -p <pid>". It would not
increase after this patch.
|
|
|
|
|
|
|
|
Also adds a 'make test' target to the Makefile. Set the JUNIT environment variable to point to your JUnit jar if it isn't found automatically.
|
|
This change ports the MUC administration related classes from
Swiften to Stroke. It also includes the MUC initialisation code in
the CoreClient.
Test-information:
tested the ported unit tests
|
|
This patch ports the classes for Storage, PrivateStorage and PrivateStorage
requests from Swiften to Stroke.
Test-information:
The JUnit test GetPrivateStorageRequestTest is also ported and run.
|
|
The original implementation of JSSEContext allocated buffers for the
SSL data which started off at a size determined using information from
the SSLSession, and which were allowed to grow to up to ten times
their original size.
When testing with jabber.org it was fairly easy to exceed this size
(e.g. requiring a buffer of ~650K where the maximum value had been
around 160K), meaning that applications would fail.
This change removes the upper limit altogether. Now, the buffer will
grow to whatever is required, so long as free memory is available.
I also renamed "enlargeBuffer" in response to a previous review comment.
Test-information:
No longer see problems when talking to jabber.org over
SSL. Instrumentation shows that buffer is growing as expected.
|