Samba talks on 1st but not 2nd interface. Any ideas?

I have a 1GbE and a 10GbE card in each of 2 computers, connected to each other. I can SMB from the Win 10 box to Ubuntu 18.04 over the 1G link, but not over the 10G. iperf3 says 300 MB/s on the 10G link and I get ~300 MB/s with iSCSI on that link. Not great speed, but let’s get Samba talking first, then optimize the link.

I recently had to re-install Ubuntu and I’m using the same conf file that was working previously. In the Samba conf, one of these lines works and the other does not.
interfaces = lo enp5s0
interfaces = lo enp5s0 enp4s0

enp5s0 is the 10G and enp4s0 is the 1G. As far as I can tell, Win 10 says the network is “private” and has all the settings of the 1G network. iSCSI has no trouble with the 10G, nor does ssh or WinSCP or iperf.

The 1G is DHCP with a static lease for Ubuntu and just DHCP for Win 10. The 10G is static for both hosts on another range, like this: 1G=192.168.7.0/24 and 10G=192.168.12.0/24. At this moment, the statics have no default gateway and no DNS server set, but the DHCP has both fields set correctly.

Any ideas or diagnostic tools/tips are greatly appreciated. I’m at wit’s end over here.

Check the basics first with ping, both from the server and from the client. Make sure the firewall is either disabled or open for Samba on the second interface. I would check with the iptables -L -vn command. If you’re running IPv6, also check ip6tables.

In fact double check your IPv6. If your local name server is returning IPv6 addresses but your interfaces are not set correctly it can block access. Or your samba configuration might not be allowing it. If you use hosts allow at all, then be sure to include the IPv6 ranges too.

And of course check your samba log files. See if there’s anything in there obviously bad looking.
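Since ssh and iperf already work on the 10G link, ping alone may not say much about Samba. One way to test the SMB port itself is to attempt a plain TCP connection to port 445 on each of the server’s addresses from the client. The following is a minimal Python sketch, not a Samba tool; the helper name port_open and the addresses in the comment are placeholders.

```python
import socket

def port_open(host, port=445, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False

# Example (placeholder addresses, substitute the server's real ones):
# for addr in ("192.168.1.x", "192.168.5.x"):
#     print(addr, port_open(addr))
```

A quick False usually means the connection was refused, which would suggest smbd is not listening on that address (worth comparing against interfaces and bind interfaces only). A False that only arrives after the full timeout points more toward dropped packets, such as a firewall or a routing problem.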

Can you:

  • run testparm
  • disable firewall
  • post your entire smb.conf (obfuscated where necessary)
  • post output of route -n

Thank you for offering help, but please read the OP that answers almost all your questions or makes them red herrings.

The firewall is down, has never been up yet. The DHCP/DNS server never sees the 10G subnet, so is never used to resolve those addresses. The 10G is static with no forwarder or DNS address. As for IPv6, I don’t want to complicate things by even assigning an address. If it can’t negotiate on its own, then wouldn’t it just not use V6 and use V4 as the default?

Thanks for reminding me that Linux has logs, unlike windows. Nothing obvious in there and only one of the dozen logs has anything in it. It probably rolled the logs and I need to go enable the ‘failing’ config so it puts something useful in the current logs.

I’m not sure of the protocol here, so I’m in-lining the text.

testparm:

[global]
	bind interfaces only = Yes
	interfaces = lo enp5s0 enp4s0
	log file = /var/log/samba/log.%m
	logging = file
	map to guest = Bad User
	max log size = 1000
	obey pam restrictions = Yes
	pam password change = Yes
	panic action = /usr/share/samba/panic-action %d
	passwd chat = *Enter\snew\s*\spassword:* %n\n *Retype\snew\s*\spassword:* %n\n *password\supdated\ssuccessfully* .
	passwd program = /usr/bin/passwd %u
	server min protocol = SMB3
	server multi channel support = Yes
	server role = standalone server
	server string = %h server (Samba, Ubuntu)
	unix password sync = Yes
	usershare allow guests = Yes
	idmap config * : backend = tdb


[zfs-1]
	comment = "Access to zfs-1"
	create mask = 0755
	force user = xxxx
	guest ok = Yes
	path = /tank/zfs-1
	read only = No

smb.conf:

[global]
workgroup = WORKGROUP
server string = %h server (Samba, Ubuntu)
#interfaces = lo enp5s0
interfaces = lo enp5s0 enp4s0
bind interfaces only = yes
server min protocol = SMB3
log file = /var/log/samba/log.%m
max log size = 1000
logging = file
panic action = /usr/share/samba/panic-action %d
server role = standalone server
obey pam restrictions = yes
unix password sync = yes
passwd program = /usr/bin/passwd %u
passwd chat = *Enter\snew\s*\spassword:* %n\n *Retype\snew\s*\spassword:* %n\n *password\supdated\ssuccessfully* .
pam password change = yes
map to guest = bad user
usershare allow guests = yes

[zfs-1]
	comment = "Access to zfs-1"
	path = /tank/zfs-1
	writeable = yes
	public = yes
	create mask = 0755
	directory mask = 0755
	force user = xxx

I had to install net-tools, but here is route -n from the Ubuntu side:

Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.1.1     0.0.0.0         UG    102    0        0 enp4s0
0.0.0.0         127.0.0.1       0.0.0.0         UG    20103  0        0 enp5s0
127.0.0.1       0.0.0.0         255.255.255.255 UH    20103  0        0 enp5s0
169.254.0.0     0.0.0.0         255.255.0.0     U     1000   0        0 enp4s0
192.168.1.0     0.0.0.0         255.255.255.0   U     102    0        0 enp4s0
192.168.5.0     0.0.0.0         255.255.255.0   U     103    0        0 enp5s0

And route print -4 from the Windows side. I have no idea where it got the persistent route from; that address doesn’t exist.

 IPv4 Route Table
===========================================================================
Active Routes:
Network Destination        Netmask          Gateway       Interface  Metric
          0.0.0.0          0.0.0.0      192.168.1.1    192.168.1.238      4
        127.0.0.0        255.0.0.0         On-link         127.0.0.1    331
        127.0.0.1  255.255.255.255         On-link         127.0.0.1    331
  127.255.255.255  255.255.255.255         On-link         127.0.0.1    331
      192.168.1.0    255.255.255.0         On-link     192.168.1.238    258
    192.168.1.238  255.255.255.255         On-link     192.168.1.238    258
    192.168.1.255  255.255.255.255         On-link     192.168.1.238    258
      192.168.5.0    255.255.255.0         On-link      192.168.5.10    271
     192.168.5.10  255.255.255.255         On-link      192.168.5.10    271
    192.168.5.255  255.255.255.255         On-link      192.168.5.10    271
        224.0.0.0        240.0.0.0         On-link         127.0.0.1    331
        224.0.0.0        240.0.0.0         On-link     192.168.1.238    258
        224.0.0.0        240.0.0.0         On-link      192.168.5.10    271
  255.255.255.255  255.255.255.255         On-link         127.0.0.1    331
  255.255.255.255  255.255.255.255         On-link     192.168.1.238    258
  255.255.255.255  255.255.255.255         On-link      192.168.5.10    271
===========================================================================
Persistent Routes:
  Network Address          Netmask  Gateway Address  Metric
          0.0.0.0          0.0.0.0   192.168.10.100  Default
===========================================================================
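To sanity-check which interface each address in these tables should leave by, here is a rough Python sketch that does a longest-prefix match over the on-link routes from the Ubuntu table. It deliberately ignores the default-gateway entries and the metrics, and the names ROUTES and outgoing_iface are only illustrative.

```python
import ipaddress

# On-link routes copied from the Ubuntu table above (Destination/Genmask, Iface).
ROUTES = [
    ("169.254.0.0/16", "enp4s0"),
    ("192.168.1.0/24", "enp4s0"),
    ("192.168.5.0/24", "enp5s0"),
]

def outgoing_iface(dest, routes=ROUTES):
    """Return the interface of the most specific on-link route covering dest, or None."""
    addr = ipaddress.ip_address(dest)
    best = None
    for cidr, iface in routes:
        net = ipaddress.ip_network(cidr)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, iface)
    return best[1] if best else None

print(outgoing_iface("192.168.5.10"))   # Windows 10G address -> enp5s0
print(outgoing_iface("192.168.1.238"))  # Windows 1G address  -> enp4s0
```

If both sides agree on which subnet sits on which card, as they appear to here, plain routing is probably not what stops SMB on the 10G link.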

Ok, I made the ‘bad’ conf active to get some fresh data in the logs. I see these kinds of messages now:

[2019/12/22 19:02:06.231635,  0] ../source3/smbd/service.c:774(make_connection_snum)
  canonicalize_connect_path failed for service zfs-1, path /tank/zfs-1
[2019/12/23 22:29:21.692339,  0] ../source3/smbd/posix_acls.c:2081(create_canon_ace_lists)
  create_canon_ace_lists: unable to map SID S-xx-xxx to uid or gid.
[2019/12/29 15:31:21.988277,  0] ../source3/smbd/smbXsrv_client.c:656(smbXsrv_client_connection_pass_loop)
  smbXsrv_client_connection_pass_loop: got connection sockfd[39]

@zlynx, @oO.o, Sorry I replied to my own post there. I’m not very social-media proficient. :wink:

Here is my output of testparm, which works for me. You could try changing some of yours to match. One thing I am suspicious of right away is that SMB3 multichannel setting. The last I heard, it was experimental and did not work very reliably. Another is that I am using security = USER, which should be the default setting, but testparm outputs it on mine and not on yours, so it makes me wonder. And I don’t remember, but I might have had to use smbpasswd to set my user password for Samba.

# testparm
Load smb config files from /etc/samba/smb.conf
WARNING: The "allocation roundup size" option is deprecated
Loaded services file OK.
Server role: ROLE_STANDALONE

Press enter to see a dump of your service definitions

# Global parameters
[global]
disable spoolss = Yes
load printers = No
log file = /var/log/samba/log.%m
logging = systemd
max log size = 50
max stat cache size = 0
min receivefile size = 16384
printcap name = /dev/null
security = USER
server min protocol = SMB3
server signing = required
server string = Catbox Server
socket options = IPTOS_LOWDELAY TCP_NODELAY
unix extensions = No
workgroup = ZLYNX
idmap config * : backend = tdb
acl allow execute always = Yes
aio read size = 16384
aio write size = 16384
allocation roundup size = 4096
case sensitive = No
hosts allow = 127. 10.1.10. 2603:300b:8c5::0/64
map archive = No
printing = bsd
strict locking = No
use sendfile = Yes
wide links = Yes

[homes]
browseable = No
comment = Home Directories
read only = No
strict allocate = Yes
vfs objects = btrfs

I don’t see anything obviously wrong with your smb.conf, which makes me think that maybe it’s an issue with Windows, which is not my wheelhouse.

The only thing that comes to mind is that I’ve had stale server mappings cause problems after reinstalling/reconfiguring an SMB server. I ran something at the command prompt to wipe old file share mappings (and then successfully remapped), but I don’t remember what it was. A Windows user should be able to help you more than me.

Again, thank you for trying to help, but you are approaching this from a samba-no-worky POV. Samba on 1G works peachy fine and file access is a non-issue. This conf worked peachy-fine on both 1G and 10G before I had to re-install ubuntu.

Thanks to Linux having a sane way to store settings, it was easy to restore. In theory, even the win-10 side hasn’t changed during the re-install of ubuntu. But with windows, who knows?

The issue isn’t that samba is broken or that either 1G or 10G is broken in all protocols, but that samba/win-10 uniquely refuses to play on the 10G interface, when everything else loves the 10G interface (at 300 MB/s) and SMB loves the 1G but won’t play on the 10G at all. The server name won’t even show up when typed-in manually.

The only thing I can see different per samba, is when I remove the 1G as an option and there is no choice but to use the 10G. The config is the same otherwise.

Therefore, the ‘problem’ would reside ‘outside’ Samba, i.e., Samba is trying to do a DNS lookup on the static subnet, which has no DNS names, and that causes Samba to reject that subnet. And yet, no errors or logs indicate anything like that.

Also, I don’t use any DNS names that would need looking up. Whenever it wants an IP, I give it the IP directly with no domain name. SMB only knows the server by the name that Samba gives it, and I assume it doesn’t change names depending on interface.

But SMB is Microsoft and they use NetBIOS and silly things like that, so it wouldn’t surprise me that there are a ton of SMB gotchas on top of the Samba requirements.

I’d be more than happy to put everything on the same subnet, with DHCP, DNS, Internet etc… Anyone got a 10GBE x 2 + 1Gbe x 8 switch they want to give me? This is so annoying, it’s got M$ written all over it as the real problem.

Sorry for the rant.

Sadly, I came here hoping otherwise, but I agree that it might be a hidden SMB requirement / quirk that win-10 demands, but isn’t obvious.

What’s the actual error you get in Windows when you try to browse to it via the 10G?

Something you might not have considered… If you’re on Windows 10 Pro, you can mount NFS shares now. Might be better for throughput as well.

Control Panel > Programs and Features > Turn Windows features on or off > Services for NFS

There isn’t an ‘error’ per se, it just hangs for a while and then says it doesn’t exist. I’ve never had the samba server ‘visible’ and don’t have any reason to. I use a logon .cmd script to map it to a drive letter and it’s many times faster and less bother than doing it manually or having win-10 stumble around and re-connect it.

If I were to ask win-10 to ‘map a network drive’ from explorer, I can type \\nas (the samba server name) and click browse, and all the SMB path is available. If I hit browse with no server clue it just stumbles around. That’s on the 1G (working) link. On the 10G link, I get this:

[Image: Screenshot (29)]

As for using NFS, I currently have a ‘work-around’, the 1G link SMB works fine, and so does the 10G ISCSI, SSH, SCP etc… I’m trying to fix an odd problem, not find a way to sweep it under the carpet (so to speak).

I have the ‘feeling’ it’s related to DNS or NetBIOS name resolution not functioning on that static link, and SMB chokes because it can’t just connect to the IP like a proper net app when told everything it needs to know.

Yeah, try using the IP instead of the server name in the UNC path.

I’ve migrated all the shares at work to IP because name-based UNC paths seem to randomly break and I couldn’t figure out why.
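Before switching paths, it can help to see what the server name actually resolves to on the client. This is a small Python sketch that asks the OS resolver for the IPv4 addresses behind a name; resolve_ipv4 is a made-up helper, and Explorer may use other lookup mechanisms than this call does, so treat the result as a hint only.

```python
import socket

def resolve_ipv4(name):
    """Return the sorted IPv4 addresses the OS resolver gives for name (empty list on failure)."""
    try:
        infos = socket.getaddrinfo(name, 445, family=socket.AF_INET, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})

# Example: print(resolve_ipv4("nas"))  # "nas" is the server name mentioned earlier in the thread
```

If the name only comes back with the 1G address, or with nothing at all, then a path like \\nas would never reach the 10G interface, while a path using the 10G IP directly would.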

I’d never used the IP version of the UNC before, and it worked, so I’m going to tag this as solved because I will be away from the computer for the rest of today. At least I know it’s a name problem now, not an SMB issue, so I can fuss with that later. For now, I have my sweet 10G and crushed SMB to my will!

If you figure it out let me know because I beat my head over that one for a while before I just resorted to using IP.

Generally you can tell if it wants to work or not by seeing if all your machines show up properly in the network browser. Sometimes they all show up… sometimes only half… sometimes only the printers. Windows 10 doesn’t seem to work well in that regard.

I’ll try to remember to come back and post results if I find something, thanks for the help.

@Adubs, I just found this link. Interesting stuff midway down and beyond. Now I really have to jump in the shower.

http://woshub.com/network-computers-not-showing-windows-10/

Maybe @Novasty knows some kind of black magic trick on this one.

I just got lazy and stuck with IP since it always worked.

Just scrolled through this quickly. UNC by name is a bit hit-or-miss from Linux to Windows; generally a workaround is either editing the hosts file or editing the DNS server.

@Novasty @Adubs, I did a similar trial where I added a name for the 10G interface directly in the Windows hosts file, and it was equivalent to using the IP. I can’t use the DNS or it wouldn’t have been an issue in the first place.

The 10G segment has no physical way to connect to the 1G segment, so the DNS has no way to talk to it or assign an IP for the segment it’s on. The point being that using an IP directly (messy for many reasons) or a named entry in the hosts file both work properly and consistently, so the hosts file method has been working flawlessly for a while now.
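For anyone trying the same thing, the Windows hosts file lives at C:\Windows\System32\drivers\etc\hosts and takes one "address name" pair per line. Below is a small Python sketch for checking that an entry parses the way you expect; the name nas10g and the address 192.168.5.20 are placeholders, not my real values, and hosts_lookup is just an illustrative helper.

```python
def hosts_lookup(hosts_text, name):
    """Return the first address mapped to name in hosts-file text, or None."""
    wanted = name.lower()
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        addr, *names = line.split()
        if wanted in (n.lower() for n in names):
            return addr
    return None

SAMPLE = """
# hypothetical entry: address and name are placeholders
192.168.5.20   nas10g
"""
print(hosts_lookup(SAMPLE, "NAS10G"))  # 192.168.5.20
```

With an entry like that in place, mapping \\nas10g\zfs-1 should go over the 10G link the same way the raw IP does.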

Eventually the correct solution is to obtain a switch that merges the 10G and 1G physical networks into a single logical subnet with a single standard DNS server. Being poor is annoying.